CANINE CORONAVIRUS S GENE AND USES
THEREFOR 


CROSS-REFERENCE TO RELATED 
APPLICATION 


This is a continuation of allowed U.S. application Ser. No.
08/331,625, now U.S. Pat. No. 6,057,436, filed Nov. 23,
1994, itself the U.S. national stage of PCT/US93/04692,
filed May 7, 1993, which is a continuation-in-part of U.S.
patent application Ser. No. 07/880,194, filed May 8, 1992,
now abandoned, which is a continuation-in-part of U.S.
patent application Ser. No. 07/698,927, filed May 13, 1991,
now abandoned, which is a continuation-in-part of U.S.
patent application Ser. No. 07/613,066, filed Nov. 14, 1990,
now abandoned.


FIELD OF THE INVENTION 


The present invention relates generally to canine coro- 
navirus infections, and specifically to proteins useful in 
prophylaxis, therapy, and diagnosis of these infections in 
canines. 


BACKGROUND OF THE INVENTION 


The coronaviruses are a large family of mammalian and 
avian pathogens which were first described in 1968. They 
are the causative agents of several diseases including 
encephalitis, hepatitis, peritonitis and gastroenteritis. Enteric 
coronaviruses have been detected in the feces of man, pigs, 
calves, cats, mice, chickens and dogs. 


Canine coronavirus (CCV) was first isolated
from dogs suffering an acute gastroenteritis, as reported by 
Binn et al., Proc. 78th Ann. Mtg. U.S. Animal Health Assoc., 
Roanoke Va., pp. 359-366 (1974). The disease became 
prevalent during the 1970s. CCV gastroenteritis appears to 
be primarily transmitted through fecal contamination from 
infected dogs via the oral route, leading ultimately to rep- 
lication of the virus in the epithelial cells of the small 
intestine. Virus can be recovered from the feces of an 
infected dog between 3 and 14 days after infection. 


CCV gastroenteritis is characterized by a mild depression, 
anorexia and loose stool from which the dog usually recov- 
ers. The onset of the disease is often sudden, accompanied 
by such symptoms as diarrhea, vomiting, excreted blood in 
stools, and dehydration. Deaths have occurred within as 
little as 24 to 36 hours after onset of clinical signs. Most 
dogs appear afebrile but elevated body temperature is seen 
in some cases. Often CCV will occur with a canine parvovi- 
rus infection and this coinfection can be fatal. 

Serologically, CCV is closely related to transmissible
gastroenteritis virus of swine (TGEV). Although
canine coronavirus does not infect pigs, transmissible gas- 
troenteritis virus produces a subclinical infection in dogs. 
However, unlike the feline infectious peritonitis coronavirus 
(FIPV), previous exposure to CCV does not predispose dogs 
to enhanced disease; and antigen-antibody complexes, if 
formed, are not associated with disease pathology. 

There remains a need in the art for compositions useful in 
diagnosing, treating and preventing infections with canine 
coronaviruses. 


SUMMARY OF THE INVENTION 


In one aspect the present invention provides the complete 
nucleotide sequence of the CCV S gene, strain 1-71, SEQ ID 
NO:1. The S gene or fragments thereof may be useful in 
diagnostic compositions for CCV infection. 

In another aspect the present invention provides a CCV S
(or spike) protein characterized by the amino acid sequence 
of a CCV S protein, SEQ ID NO:2, and peptide fragments 
thereof. These proteins may be optionally fused or linked to 
other fusion proteins or molecules. 


Thus, in another aspect, the present invention provides a 
vaccine composition containing an effective immunogenic 
amount of at least one CCV S protein or an immunogenic 
fragment thereof. 


In still another aspect, the invention provides a method of 
vaccinating an animal against infection with a coronavirus 
by administering an effective amount of a vaccine compo- 
sition of this invention. 


In yet a further aspect, the present invention provides a 
pharmaceutical composition for the treatment of CCV infec- 
tion comprising a therapeutically effective amount of a CCV 
S peptide or protein of the invention and a pharmaceutically 
effective carrier. 


Still another aspect of this invention is an antibody 
directed to CCV, which antibody is capable of distinguishing 
between CCV and other canine viruses. These antibodies 
may also be employed as diagnostic or therapeutic reagents. 


In yet another aspect, a diagnostic reagent of the present 
invention comprises a CCV S protein or fragment thereof. In 
another aspect, the present invention provides a diagnostic 
reagent which comprises a nucleotide sequence which 
encodes a CCV S protein or fragment of the invention, 
and/or a nucleotide sequence which flanks the coding 
region, or fragments thereof. These protein and nucleotide 
sequences are optionally associated with detectable labels. 
Such diagnostic reagents may be used to assay for the 
presence of CCV in dogs using standard assay formats and 
can form components of a diagnostic kit. 


In a further aspect, the invention provides a method of 
using a diagnostic reagent of this invention to identify dogs 
which are uninfected or which have been previously 
exposed to CCV. The diagnostic method can differentiate 
exposure to CCV from exposure to other related 
coronaviruses, allow the identification of dogs which have 
been vaccinated against these diseases, and allow one to 
distinguish between different strains of CCV, or to identify 
dogs at advanced stages of CCV infection. 

In yet a further aspect, the invention provides a method 
for the production of a recombinant CCV protein comprising 
culturing a selected host cell, e.g., a mammalian cell or viral 
vector, transformed with a DNA sequence encoding a 
selected CCV S protein or fragment thereof in operative 
association with regulatory sequences capable of regulating 
the expression of said protein. 

Another aspect of the invention is a recombinant DNA 
molecule comprising a DNA sequence coding for a selected 
portion of a canine coronavirus S protein, the DNA
sequence being in operative association with regulatory
sequences capable of directing the expression thereof in host 
cells. 

Other aspects and advantages of the present invention are 
described further in the following detailed description of the 
preferred embodiments thereof. 


DETAILED DESCRIPTION OF THE 
INVENTION 


The present invention provides novel isolated canine 
coronavirus (CCV) S proteins and fragments thereof, as well 
as isolated nucleotide sequences encoding the proteins or 
fragments. These proteins and fragments are useful for 
diagnostic, vaccinal and therapeutic compositions as well as
methods for using these compositions in the diagnosis, 
prophylaxis and treatment of CCV-related and other 
coronavirus-related conditions. 

I. Definitions 

As defined herein, an amino acid fragment is any amino 
acid sequence from at least about 8 amino acids in length up 
to about the full-length CCV S gene protein. A nucleotide 
fragment defines a nucleotide sequence which encodes from 
at least about 8 amino acids in length up to about the 
full-length CCV S gene protein. 

The term “region” refers to all or a portion of a gene or 
protein, which may contain one or more fragments as 
defined above. 

The term “immunogenic” refers to any S gene protein or 
fragment thereof, any molecule, protein, peptide, 
carbohydrate, virus, region or portion thereof which is 
capable of eliciting a protective immune response in a host, 
e.g., an animal, into which it is introduced. 

The term “antigenic” refers only to the ability of a 
molecule, protein, peptide, carbohydrate, virus, region or 
portion thereof to elicit antibody formation in a host (not 
necessarily protective). 

As used herein, the term “epitope” refers to a region of a 
protein which is involved in its immunogenicity, and can 
include regions which induce B cell and/or T cell responses. 

As used herein, the term “B cell site or T cell site” defines 
a region of the protein which is a site for B cell or T cell 
binding. Preferably this term refers to sites which are 
involved in the immunogenicity of the protein. 

II. Sources of CCV Sequences 

The examples below specifically refer to newly identified 
spike gene sequences from canine coronavirus (CCV) strain 
1-71. This strain is deposited with the American Type 
Culture Collection (ATCC), 12301 Parklawn Drive, 
Rockville, Md. under Accession No. VR-809. Particularly 
disclosed are nucleotide and amino acid sequences, SEQ ID 
NO:1 and 2, respectively, of the CCV S gene. 

The present invention is not limited to the particular CCV 
strain employed in the examples. Other CCV strains have 
been described, e.g., strain CCV-TN449 [ATCC 2068]. 
Utilizing the teachings of this invention, analogous frag- 
ments of other canine coronavirus strains can be identified 
and used in the compositions of this invention. 

III. CCV Nucleotide and Amino Acid Sequences of the
Invention 

The inventors have identified and selected nucleotide and 
protein sequences of CCV strain 1-71 which have been 
determined to be of interest for use as vaccinal, therapeutic 
and/or diagnostic compositions. For example, selected pep- 
tide and nucleotide sequences present primarily in the vari- 
able N terminal region of the CCV S protein and gene are 
characterized by representing areas of homology between 
FIPV, TGEV, feline enteric coronavirus (FECV) and other 
coronavirus strains. 

Peptide fragments obtained from this heterogeneous N
terminus of the S protein are useful in diagnostic
compositions and kits for distinguishing infection
with CCV strain 1-71 from other CCV infections, and for
distinguishing infection with CCV from infection with the other
coronaviruses identified above in a vaccinated or infected dog, as
well as for use in vaccine and therapeutic agents.

Additionally, the amino terminal sequences of CCV S 
protein include peptide sequences which are B cell sites and 
thus useful in vaccinal or therapeutic compositions, or for 
generating antibodies to CCV, in assays for the detection of 
CCV antibodies in dogs. 


In addition, certain peptide fragments of the CCV S 
protein are believed to represent T cell sites, and thus are 
useful in vaccinal or therapeutic compositions. 

Other suitable CCV amino acid regions for pharmaceu- 
tical or diagnostic use are located within other regions of the 
CCV S protein SEQ ID NO: 2. These amino acid and 
nucleotide fragments of the CCV S protein and its nucleotide 
sequence discussed above are specifically reported below in 
Tables I and II. Table II also reports the respective homolo- 
gies of certain of these desired fragments to wild-type FIPV, 
i.e., FIPV WSU 1146. The CCV S nucleotide fragments in
Tables I and II can be useful for diagnostic probes, PCR 
primers, or for use in recombinant production of relevant S 
protein fragments for use in therapeutic or vaccinal compo- 
sitions. Other suitable fragments may also be identified for 
such use. 


TABLE I

CCV Amino Acids

B cell sites T cell sites SEQ ID NOS:

50-250          3
375-425         4
450-470         5
550-600         6
650-700         7
770-850         8
900-1025        9
1150-1225       10
1250-1452       11

40-47           12
63-81           13
187-191         14
241-274         15
335-341         16
395-428         17
468-494         18
846-860         19
916-952         20
977-992         21
1068-1145       22
1366-1391       23

TABLE II

Amino Acid Sequences

CCV 1-71      CCV 1-71      % Homology to       SEQ ID NOS.
Amino Acid    Nucleotides   WT FIPV WSU 1146    AA and Nucl.

1113-1236     3337-3708     100                 25 and 24
540-599       1618-1797     93.3                27 and 26
342-388       1024-1164     93.6                29 and 28
137-153       409-459       64.7                31 and 30
375-388       1123-1164     85.7                33 and 32
1424-1440     4270-4320     94.1                35 and 34
1407-1420     4219-4260     85.7                37 and 36
1342-1406     4024-4218     96.9                39 and 38
398-652       1192-1956     93.3                41 and 40
128-555       382-1665      89.5                43 and 42
447-628       1339-1884     91.8                45 and 44
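
The amino acid and nucleotide ranges paired in Table II follow the usual codon arithmetic: residue n is encoded by nucleotides 3(n-1)+1 through 3n. The following Python sketch illustrates that mapping. It assumes, based only on the rows of Table II, that nucleotide 1 is the first base of codon 1; the function name aa_to_nt_range is hypothetical and is not part of the specification.

```python
from typing import Tuple


def aa_to_nt_range(aa_start: int, aa_end: int) -> Tuple[int, int]:
    """Map a 1-based, inclusive amino acid range to the 1-based, inclusive
    nucleotide range that encodes it.

    Assumption: nucleotide 1 is the first base of codon 1, which is
    consistent with the ranges paired in Table II.
    """
    if aa_start < 1 or aa_end < aa_start:
        raise ValueError("expected 1 <= aa_start <= aa_end")
    return 3 * (aa_start - 1) + 1, 3 * aa_end


# Check the arithmetic against three rows of Table II.
for aa_range, nt_range in [
    ((1113, 1236), (3337, 3708)),
    ((137, 153), (409, 459)),
    ((447, 628), (1339, 1884)),
]:
    assert aa_to_nt_range(*aa_range) == nt_range
```

A mapping of this kind may be convenient when selecting a nucleotide fragment for a probe or primer that corresponds to a given peptide fragment.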





IV. Modified Sequences of the Invention 

In addition to the amino acid sequences and correspond- 
ing nucleotide sequences of the specifically-recited embodi- 
ments of CCV S proteins of this invention, the invention also 
encompasses other DNA and amino acid sequences of CCV 
S proteins. Such other nucleic acid sequences include those 
sequences capable of hybridizing to SEQ ID NO: 1 under 
conditions of at least 85% stringency, i.e. having at least 
85% homology to the sequence of SEQ ID NO: 1, more 
preferably at least 90% homology, and most preferably at
least 95% homology. Such homologous sequences are char- 
acterized by encoding a CCV S gene protein related to strain 
1-71. 
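
By way of illustration only, the percentage thresholds recited above can be read as simple percent identity between two sequences that have already been aligned to equal length. That reading is an assumption; the specification does not prescribe a particular algorithm, and hybridization behavior depends on more than identity. The Python sketch below uses a hypothetical function name and made-up sequences.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Return the percentage of positions at which two pre-aligned,
    equal-length sequences carry the same character.

    Assumption: the caller has already aligned the sequences; no gaps
    are introduced or scored here.
    """
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a == b)
    return 100.0 * matches / len(seq_a)


# Hypothetical 20-nucleotide sequences differing at 3 positions.
reference = "AAAAAAAAAAAAAAAAAAAA"
candidate = "AAAAAAAAAAAAAAAAATTT"
meets_85_percent = percent_identity(reference, candidate) >= 85.0
```

Under this reading, a candidate sequence would be compared with SEQ ID NO: 1 over the aligned region and retained if the result meets the chosen threshold.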

Further, allelic variations (naturally-occurring base 
changes in the species population which may or may not 
result in an amino acid change) of DNA sequences encoding 
the various S amino acid or DNA sequences from the 
illustrated CCV are also included in the present invention, as 
well as analogs or derivatives thereof. Similarly, DNA 
sequences which code for protein sequences of the invention 
but which differ in codon sequence due to the degeneracies 
of the genetic code or variations in the DNA sequence 
encoding these proteins which are caused by point mutations 
or by induced modifications to enhance the activity, half-life 
or production of the peptide encoded thereby are also 
encompassed in the invention. 

Variations in the amino acid sequences of this invention 
may typically include analogs that differ by only 1 to about 
4 codon changes. Other examples of analogs include 
polypeptides with minor amino acid variations from the 
natural amino acid sequence of S gene proteins and/or the 
fusion partner; in particular, conservative amino acid 
replacements. Conservative replacements are those that take 
place within a family of amino acids that are related in their 
side chains. Genetically encoded amino acids are generally 
divided into four families: (1) acidic=aspartate, glutamate; 
(2) basic=lysine, arginine, histidine; (3) non-polar=alanine, 
valine, leucine, isoleucine, proline, phenylalanine, 
methionine, tryptophan; and (4) uncharged polar=glycine, 
asparagine, glutamine, cysteine, serine, threonine, tyrosine. 
Phenylalanine, tryptophan, and tyrosine are sometimes clas- 
sified jointly as aromatic amino acids. For example, it is 
reasonable to expect that an isolated replacement of a 
leucine with an isoleucine or valine, an aspartate with a 
glutamate, a threonine with a serine, or a similar conserva- 
tive replacement of an amino acid with a structurally related 
amino acid will not have a significant effect on its activity, 
especially if the replacement does not involve an amino acid 
at an epitope of the polypeptides of this invention. 
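
The four families listed above can be encoded directly to test whether a proposed replacement is conservative in the sense of this section. The following Python sketch is illustrative only; the one-letter codes are the standard amino acid abbreviations, and the names FAMILIES, family_of and is_conservative are hypothetical.

```python
# The four side-chain families as listed in the text, in one-letter code.
FAMILIES = {
    "acidic": {"D", "E"},                                     # aspartate, glutamate
    "basic": {"K", "R", "H"},                                 # lysine, arginine, histidine
    "non-polar": {"A", "V", "L", "I", "P", "F", "M", "W"},
    "uncharged polar": {"G", "N", "Q", "C", "S", "T", "Y"},
}


def family_of(residue: str) -> str:
    """Return the name of the family containing a one-letter residue code."""
    code = residue.upper()
    for name, members in FAMILIES.items():
        if code in members:
            return name
    raise ValueError(f"not a genetically encoded amino acid: {residue!r}")


def is_conservative(original: str, replacement: str) -> bool:
    """True when both residues fall within the same side-chain family."""
    return family_of(original) == family_of(replacement)


# Examples from the text: leucine to isoleucine, aspartate to glutamate,
# threonine to serine.
examples = [is_conservative("L", "I"), is_conservative("D", "E"), is_conservative("T", "S")]
```

A family test of this kind says nothing about whether the residue lies within an epitope, which the text identifies as the more important consideration.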

V. Fusion Proteins 

If desired, the CCV S proteins and peptide fragments, e.g. 
those identified in Tables I and II, can be produced in the 
form of fusion proteins as defined below. Such a fusion 
protein may contain either a full-length CCV S protein or an 
immunogenic fragment thereof. Suitable fragments include 
those contained within SEQ ID NO: 2 and the amino acid
fragments of Tables I and II. Other suitable fragments can be 
determined by one of skill in the art by analogy to the 
sequences provided herein. 

Proteins or peptides may be selected to form fusion 
proteins with the selected S protein or peptide sequence 
based on a number of considerations. The fusion partner 
may be a preferred signal sequence, a sequence which is 
characterized by enhanced secretion in a selected host cell 
system, or a sequence which enhances the stability or 
presentation of the S-derived peptide. Such exemplary 
fusion partners include, without limitation, ubiquitin and a 
mating factor for yeast expression systems, and beta- 
galactosidase and influenza NS-1 protein for bacterial sys- 
tems. One of skill in the art can readily select an appropriate 
fusion partner for a selected expression system. The present 
invention is not limited to the use of any particular fusion 
partner. 

The CCV S protein or fragments thereof can optionally be 
fused to each other or to the fusion partner through a 
conventional linker sequence, i.e., containing about 2 to 50
amino acids, and more preferably, about 2 to about 20 amino
acids in length. This optional linker may provide space 
between the two linked sequences. Alternatively, this linker 
sequence may encode, if desired, a polypeptide which is 
selectively cleavable or digestible by conventional chemical 
or enzymatic methods. For example, the selected cleavage 
site may be an enzymatic cleavage site, including sites for 
cleavage by a proteolytic enzyme, such as enterokinase, 
factor Xa, trypsin, collagenase and thrombin. Alternatively, 
the cleavage site in the linker may be a site capable of being 
cleaved upon exposure to a selected chemical, e.g., cyano- 
gen bromide or hydroxylamine. The cleavage site, if inserted 
into a linker useful in the fused sequences of this invention, 
does not limit this invention. Any desired cleavage site, of 
which many are known in the art, may be used for this 
purpose. 

VI. Production of Sequences of Invention 

The CCV S gene protein of the invention and amino acid 
regions, fragments thereof and their corresponding nucle- 
otide sequences, as well as other proteins described herein, 
e.g. fusion partners, may be produced by conventional 
methods. These proteins or fragments and the nucleotide 
sequences may be prepared by chemical synthesis tech- 
niques [Merrifield, J.A.C.S., 85:2149-2154 (1963)]. 
Preferably, however, they are prepared by known recombi- 
nant DNA techniques by cloning and expressing within a 
host microorganism or cell a DNA fragment carrying a 
coding sequence for the selected protein. See, e.g., Sambrook
et al, “Molecular Cloning: A Laboratory Manual”, 2nd
ed., Cold Spring Harbor Laboratory, New York (1989).
Such techniques are discussed below in the Examples. 

According to cloning techniques, a selected gene fragment
of this invention can be cloned into a selected expression
vector. Vectors for use in the method of producing S
proteins comprise a novel S gene DNA sequence (or
a fragment thereof) of the invention and selected regulatory 
sequences in operative association with the DNA coding 
sequence, and capable of directing the replication and 
expression of the peptide in a selected host cell. 

Vectors, e.g., polynucleotide molecules, of the invention 
may be designed for expression of CCV S proteins and/or 
fusion proteins in bacterial, mammalian, fungal or insect 
cells or in selected viruses. Suitable vectors are known to 
one skilled in the art by resort to known publications or 
suppliers. 

The resulting DNA molecules or vectors containing 
nucleotide sequences encoding the canine coronavirus S 
peptides or fragments thereof and/or encoding the fusion 
proteins are then introduced into host cells and expression of 
the heterologous protein induced. 

Additional expression systems may include the known 
viral expression systems, e.g., vaccinia, fowlpox, swine pox. 
It is understood additionally, that the design of the expres- 
sion vector will depend on the choice of host cell. A variety 
of suitable expression systems in any of the below-identified 
host cells are known to those skilled in the art and may be 
readily selected without undue effort. 

Suitable cells or cell lines for use in expressing the S 
protein or peptides of this invention can be eukaryotic or 
prokaryotic. A preferred expression system includes mammalian
cells, such as Chinese hamster ovary (CHO) cells or
COS-1 cells. The selection of other suitable mammalian host 
cells and methods for transformation, culture, amplification, 
screening and product production and purification are 
known in the art. See, e.g., Gething and Sambrook, Nature, 
293:620-625 (1981), or alternatively, Kaufman et al, Mol. 
Cell. Biol., 5(7):1750-1759 (1985) or Howley et al, U.S.
Pat. No. 4,419,446. Also desirable are insect cell systems,
such as the baculovirus or Drosophila systems. The selection 
of other suitable host cells and methods for transformation, 
culture, amplification, screening and product production and 
purification can be performed by one of skill in the art by 
reference to known techniques. See, e.g., Gething and 
Sambrook, Nature, 293:620-625 (1981). 

After the transformed host cells are conventionally cul- 
tured for suitable times and under suitable culture conditions 
known to those skilled in the art, the cells may be lysed. It 
may also be possible, depending on the construct employed, 
that the recombinant proteins are secreted extracellularly 
and obtained from the culture medium. Cell lysates or 
culture medium are then screened for the presence of CCV 
S protein or peptide which is recognized by antibodies,
preferably monoclonal antibodies (MAbs), to a peptide 
antigenic site from CCV. 

Similarly, the fusion proteins may be produced by resort 
to chemical synthesis techniques, or preferably, recombinant 
methods, as described above. The selected primer sets used 
in the PCR reaction described in the Examples below may 
be designed to produce PCR amplified fragments containing 
restriction endonuclease cleavage site sequences for introduction
of a canine coronavirus S gene fragment in a specific
orientation into a selected expression vector to produce 
fusion proteins of the invention. The vector may contain a 
desired protein or fragment thereof to which the S gene 
fragment is fused in frame to produce a fusion protein. 

The crude cell lysates containing the CCV S protein or 
peptides or fusion proteins can be used directly as vaccinal 
components, therapeutic compositions or diagnostic 
reagents. Alternatively, the CCV S peptides can be purified 
from the crude lysate or medium by conventional means. 

VII. Vaccine Compositions

The CCV S proteins and immunogenic fragments of this 
invention may be incorporated in a vaccine composition. 
Such a vaccine composition may contain an immunogenic 
amount of one or more selected CCV S peptides or proteins, 
e.g., encoded by the complete S gene sequence of CCV or 
partial sequences thereof, and prepared according to the 
method of the present invention, together with a carrier 
suitable for administration as a vaccine composition for 
prophylactic treatment of CCV infections. The protein may 
be in the form of a fusion protein as above-described. 
Alternatively, the CCV S gene or fragment may be incor- 
porated into a live vector, e.g., adenovirus, vaccinia virus 
and the like. The expression of vaccinal proteins in such live
vectors is well known to those in the art [see, e.g., U.S.
Pat. No. 4,920,209]. It is preferable that the protein
employed in the vaccine composition induces protective 
immune responses against more than one strain of CCV. 

A vaccine composition according to the invention may 
optionally contain other immunogenic components. Particu- 
larly desirable are vaccine compositions containing other 
canine antigens, e.g., canine distemper, Borrelia 
burgdorferi, canine Bordetella, rabies, canine parvovirus, 
Leptospira sp., canine rotavirus, canine parainfluenza
virus and canine adenovirus. 

In another embodiment, the CCV S proteins may be used
in a combination vaccine directed to related coronaviruses. 
Other suitable coronaviruses which can be used in such a 
combination vaccine include a feline coronavirus, such as 
FIPV or FECV. For example, a CCV S peptide or protein of 
the present invention may be employed as an additional 
antigen in the temperature sensitive FIPV vaccine described 
in detail in co-owned, co-pending U.S. patent application 
Ser. No. 07/428,796 filed Oct. 30, 1989, incorporated by 
reference herein. Alternatively, the CCV S protein or peptide
or a fragment thereof could be used in a vaccine composition 
containing other coronavirus S proteins or fragments 
thereof, particularly those described in co-pending, 
co-owned U.S. patent application Ser. No. 07/698,927 (and 
its corresponding published PCT Application No.
WO92/08487).

The preparation of a pharmaceutically acceptable vaccine 
composition, having appropriate pH, isotonicity, stability and
other conventional characteristics, is within the skill of the
art. Thus such vaccines may optionally contain other conventional
components, such as adjuvants and/or carriers, e.g.
aqueous suspensions of aluminum and magnesium 
hydroxides, liposomes and the like. 

The vaccine composition may be employed to vaccinate 
animals against the clinical symptoms associated with CCV. 
The vaccines according to the present invention can be 
administered by an appropriate route, e.g., by the oral, 
intranasal, subcutaneous, intraperitoneal or intramuscular 
routes. The presently preferred methods of administration 
are the subcutaneous and intranasal routes. 

The amount of the CCV S peptide or protein of the 
invention present in each vaccine dose is selected with 
regard to consideration of the animal’s age, weight, sex, 
general physical condition and the like. The amount required 
to induce an immunoprotective response in the animal 
without significant adverse side effects may vary depending 
upon the recombinant protein employed as immunogen and 
the optional presence of an adjuvant. Generally, it is 
expected that each dose will comprise between about 
0.05-5000 micrograms of protein per mL, and preferably 
0.05-100 micrograms per mL of a sterile solution of an 
immunogenic amount of a protein or peptide of this inven- 
tion. Initial doses may be optionally followed by repeated 
boosts, where desirable. 

Another vaccine agent of the present invention is an 
anti-sense RNA sequence generated to the S gene of CCV 
strain 1-71 [SEQ ID NO:1] [S. T. Crooke et al, Biotech., 
10:882-886 (August 1992)]. This sequence may easily be 
generated by one of skill in the art either synthetically or 
recombinantly. Under appropriate delivery, such an anti- 
sense RNA sequence when administered to an infected 
animal should be capable of binding to the RNA of the virus, 
thereby preventing viral replication in the cell. 
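
As a minimal illustration of how such a sequence relates to its target, an anti-sense RNA is the reverse complement of the sense strand it is intended to bind. The Python sketch below assumes the sense sequence is supplied as a plain string of A, C, G and U (or T, which is converted to U); the function name antisense_rna and the six-base example are hypothetical and are not taken from SEQ ID NO:1.

```python
# Translation table for Watson-Crick complements in RNA.
_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_rna(sense: str) -> str:
    """Return the anti-sense RNA (reverse complement) of a sense sequence.

    DNA-style input is accepted: every T is read as U before complementing.
    """
    rna = sense.upper().replace("T", "U")
    unexpected = set(rna) - set("ACGU")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    # Complement each base, then reverse so the result reads 5' to 3'.
    return rna.translate(_COMPLEMENT)[::-1]


# Hypothetical six-base fragment, for illustration only.
example = antisense_rna("AUGGCU")
```

The sketch addresses only the sequence relationship; delivery, stability and binding in the cell are outside its scope.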

VIII. Pharmaceutical Compositions 

The invention also provides a pharmaceutical composi- 
tion comprising one or more CCV S peptides or proteins 
prepared according to the present invention and a pharma- 
ceutically effective carrier. Suitable pharmaceutically effec- 
tive carriers for internal administration are known to those 
skilled in the art. One selected carrier is sterile saline. The 
pharmaceutical composition can be adapted for administra- 
tion by any appropriate route, but is designed preferentially 
for administration by injection or intranasal administration. 

IX. Antibodies of the Invention

The present invention also encompasses the development 
of an antibody to one or more epitopes in the above 
identified amino acid sequences derived from the CCV S 
protein, which epitope is distinct from those of other CCV 
strains or other coronaviruses, e.g. FIPV, TGEV or FECV. 
The antibody can be developed employing as an antigenic 
substance, a peptide of Table I or II. Alternatively, other 
regions of the CCV strain 1-71 S protein SEQ ID NO: 2 may 
be employed in the development of an antibody according to 
conventional techniques. 

In one embodiment, the antibody is capable of identifying 
or binding to a CCV antigenic site encoded by SEQ ID NO: 
1 or a fragment thereof. Such an antibody may be used in a
diagnostic screening test, e.g., as a hybridization probe, or as 
a therapeutic agent. 

Antibodies which bind CCV peptides from the regions 
identified above or to other regions capable of distinguishing 
between CCV, TGEV, FIPV, FECV, and other coronaviruses 
for use in the assays of this invention may be polyclonal. 
However, it is desirable for purposes of increased target 
specificity to utilize MAbs, both in the assays of this 
invention and as potential therapeutic and prophylactic 
agents. Additionally, synthetically designed MAbs may be 
made by known genetic engineering techniques [W. D. Huse 
et al, Science, 246:1275-1281 (1989)] and employed in the
methods described herein. For purposes of simplicity the 
term MAb(s) will be used throughout this specification; 
however, it should be understood that certain polyclonal 
antibodies, particularly high titer polyclonal antibodies and 
recombinant antibodies, may also be employed. 

A MAb may be generated by the well-known Kohler and 
Milstein techniques and modifications thereof and directed 
to one or more of the amino acid residue regions identified 
above, or to other CCV S peptides or epitopes containing 
differences between CCV strain 1-71 and other coronavi- 
ruses. For example, a fragment of SEQ ID NO: 2 which 
represents an antigenic site, which differs from that of FIPV, 
may be presented as an antigen in conventional techniques 
for developing MAbs. One of skill in the art may generate 
any number of MAbs by using fragments of the amino acid 
residue regions identified herein as an immunogen and 
employing these teachings. 

For diagnostic purposes, the antibodies (as well as the 
diagnostic probes) may be associated with individual labels. 
Where more than one antibody is employed in a diagnostic 
method, the labels are desirably interactive to produce a 
detectable signal. Most desirably, the label is detectable
visually, e.g., colorimetrically. Detectable labels for attachment
to antibodies useful in the diagnostic assays of this
invention may also be easily selected by one skilled in the
art of diagnostic assays, and include, without
limitation, horseradish peroxidase (HRP) or alkaline phos- 
phatase (AP), hexokinase in conjunction with glucose-6- 
phosphate dehydrogenase, and NAD oxidoreductase with 
luciferase and substrates NADH and FMN or peroxidase 
with luminol and substrate peroxide. These and other appro- 
priate label systems and methods for coupling them to 
antibodies or peptides are known to those of skill in the art. 

Antibodies may also be used therapeutically as targeting 
agents to deliver virus-toxic or infected cell-toxic agents to 
infected cells. Rather than being associated with labels for 
diagnostic uses, a therapeutic agent employs the antibody 
linked to an agent or ligand capable of disabling the repli- 
cating mechanism of the virus or of destroying the virally- 
infected cell. The identity of the toxic ligand does not limit 
the present invention. It is expected that preferred antibodies 
to peptides encoded by the S genes identified herein may be 
screened for the ability to internalize into the infected cell 
and deliver the ligand into the cell. 

X. Diagnostic Reagents and Assays 

The nucleotide sequences, amino acid fragments and 
antibodies described above may be employed as diagnostic 
reagents for use in a variety of diagnostic methods according 
to this invention. 

A. PCR Diagnostic Assays

For example, these sequences can be utilized in a diag- 
nostic method employing the polymerase chain reaction 
(PCR) technique to identify the presence of a CCV or 
CCV-like virus and in therapy of infected animals. 

In addition to those sequences identified above, the oligonucleotide
sequences that were designed to prime cDNA
synthesis at specific sites within the CCV S gene, as 
described in detail below in Example 3 [SEQ ID NO:46-50], 
may also be employed as diagnostic reagents according to 
this invention. These sequences, as well as the below- 
described optimized conditions for the PCR amplification of 
CCV fragments therefrom, may also be employed in a 
diagnostic method. 

The PCR technique is known to those of skill in the art of 
genetic engineering and is described in detail in Example 4 
[see, e.g., R. K. Saiki et al, Science, 230:1350-1354 (1985)], 
which is incorporated herein by reference. Briefly described, 
PCR employs two oligonucleotide primers which are 
complementary to the opposite strands of a double stranded 
nucleic acid of interest whose strands are oriented such that 
when they are extended by DNA polymerase, synthesis 
occurs across the region which separates the oligonucle- 
otides. By repeated cycles of heat denaturation, annealing of 
the primers to their complementary sequences and extension 
of the annealed primers with a temperature stable DNA 
polymerase, millions of copies of the target gene sequence 
are generated. The template for the reaction is total RNA, 
which is isolated from CCV infected cells. DNA fragments 
generated by PCR were amplified from cDNA which had 
been synthesized from this RNA. Other strains of CCV or 
CCV-related sequences may also provide PCR templates in 
a similar manner. 
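
As a rough illustration of why repeated cycling yields millions of copies, the following Python sketch computes the theoretical copy number after a given number of cycles. The function name and the per-cycle efficiency parameter are assumptions made for illustration only; real reactions plateau and do not double perfectly at every cycle.

```python
def theoretical_copies(initial_templates: int, cycles: int,
                       efficiency: float = 1.0) -> float:
    """Idealized PCR yield: each cycle multiplies the target by (1 + efficiency).

    efficiency = 1.0 means perfect doubling; this is an assumption, not a
    measured value from the specification.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_templates * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    # With perfect doubling, about 20 cycles already exceed one million copies.
    for n in (10, 20, 30):
        print(n, theoretical_copies(1, n))
```

Under this idealized model a single template exceeds one million copies after 20 cycles, which is consistent with the "millions of copies" described above.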

In one diagnostic method, for example, heterogeneous 
CCV gene sequences of this invention are useful as reagents 
in diagnostic assays to detect and distinguish the presence of 
specific viruses from each other, e.g., to distinguish one 
canine coronavirus strain from another or one species of 
coronavirus from another by means of conventional assay 
formats. For example, using protocols similar to those used 
for forensic purposes, tissue or blood samples from a dog 
suspected to be infected with CCV would be subjected to 
PCR amplification with a selected CCV-specific set of 
primers, such as those DNA sequences disclosed herein. 
Amplification of DNA from a sample tissue or biological 
fluid of the animal suspected of infection using nucleotide 
sequences as primers specific for regions of the CCV viral 
gene sequences could correlate to the presence of CCV. 
Absence of CCV in the sample would result in no amplifi- 
cation. Similarly, the selection of specific sets of S gene 
primers would allow the identification of a particular strain 
of CCV as well. Thus, appropriate treatments may be 
selected for the infected animal. 

Example 3 provides oligonucleotide primers which per- 
mitted the synthesis of regions of the CCV S gene. The 
nucleotide sequence of the S gene of CCV provides desir- 
able sequences for hybridization probes and PCR primers, 
for example, the sequences between nucleotide base pairs 
900 to about 1600 [SEQ ID NO: 55] and about 2500 to about 
3900 [SEQ ID NO: 56] of SEQ ID NO: 1. Smaller or larger 
DNA fragments in these regions may also be employed as 
PCR primers or hybridization probes. 
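
The primers of Table III can be inspected with a few lines of Python. The sketch below is illustrative only: the primer sequences are copied from Table III, the recognition sequences for StuI, XmaI and PstI are the standard ones for those enzymes, and the helper names are hypothetical. It checks that each primer carries the restriction site named beside it, that the 3' end of primer #3045 matches the first 27 bases of SEQ ID NO:1, and that the 5' end of bottom primer #4920 is the reverse complement of a stretch of top primer #2600.

```python
# Primer sequences copied from Table III (SEQ ID NOS: 46-50).
PRIMERS = {
    "1923": "TAAATAGGCCTTTAGTGGACATGCACTTTTTCAATTGG",
    "2443": "TTAGTAGGCCTGTCGAGGCTATGGGTTGACCATAACCAC",
    "2600": "CAGATCCCGGGTGTACAATCTGGTATGGGTGCTACAG",
    "3045": "GTGCCCCCGGGTATGATTGTGCTCGTAACTTGCCTCTTG",
    "4920": "AGCACCCATACCAGATTGTACATCTGCAGTGAAATTAAGATTG",
}

# Standard recognition sequences for the enzymes named in Table III.
SITES = {"StuI": "AGGCCT", "XmaI": "CCCGGG", "PstI": "CTGCAG"}
PRIMER_SITE = {"1923": "StuI", "2443": "StuI", "2600": "XmaI",
               "3045": "XmaI", "4920": "PstI"}

# First 27 bases (9 codons) of SEQ ID NO:1 as given in the sequence listing.
S_GENE_START = "ATGATTGTGCTCGTAACTTGCCTCTTG"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def site_present(primer_id: str) -> bool:
    """True if the primer contains the restriction site listed beside it."""
    return SITES[PRIMER_SITE[primer_id]] in PRIMERS[primer_id]


if __name__ == "__main__":
    for pid in sorted(PRIMERS):
        print(pid, PRIMER_SITE[pid], site_present(pid))
    print("3045 3' end matches S gene start:",
          PRIMERS["3045"].endswith(S_GENE_START))
    print("4920 5' end overlaps 2600:",
          reverse_complement(PRIMERS["4920"][:22]) in PRIMERS["2600"])
```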

It is desirable to have PCR primer sequences between 15 
and 30 bases in length, with an intervening sequence of at least 
100 bases to as large as 5000 bases therebetween, according 
to conventional PCR technology. However, it is possible that 
larger or smaller sequence lengths may be useful based upon 
modifications to the PCR technology. In general, in order to 
achieve satisfactory discrimination, a hybridization or oli- 
gonucleotide probe made up of one or more of these 
sequences would consist of between 15 and 50 bases in 
length based on current technology. 
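
The length guidelines in the preceding paragraph can be expressed as a simple check. This is a minimal sketch: the function name, argument names and message wording are hypothetical, and the default ranges are simply the 15 to 30 base primer length and 100 to 5000 base intervening sequence stated above.

```python
def check_primer_pair(forward_len, reverse_len, intervening_len,
                      primer_range=(15, 30), intervening_range=(100, 5000)):
    """Return a list of guideline violations (empty if the pair conforms)."""
    problems = []
    for name, length in (("forward", forward_len), ("reverse", reverse_len)):
        if not primer_range[0] <= length <= primer_range[1]:
            problems.append(
                f"{name} primer length {length} outside "
                f"{primer_range[0]}-{primer_range[1]} bases")
    if not intervening_range[0] <= intervening_len <= intervening_range[1]:
        problems.append(
            f"intervening sequence {intervening_len} outside "
            f"{intervening_range[0]}-{intervening_range[1]} bases")
    return problems


if __name__ == "__main__":
    print(check_primer_pair(20, 22, 700))   # conforms to both guidelines
    print(check_primer_pair(12, 22, 50))    # short primer, short interval
```

Because the text notes that larger or smaller lengths may be useful with modified PCR technology, the ranges are exposed as parameters rather than fixed.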

B. Conventional Assay Formats 

The CCV S proteins or peptide fragments may also be 
employed in standard diagnostic assays which rely on S 
protein immunogens as targets for sera recognition. The 
diagnostic assays may be any conventionally employed 
assay, e.g., a sandwich ELISA assay, a Western blot, a 
Southern blot and the like. Because a wide variety of 
diagnostic methods exist and are conventionally known 
which can be adapted to the use of the nucleotide and amino 
acid sequences described herein, it should be understood 
that the nature of the diagnostic assay does not limit the use 
of the sequences of this invention. 

For example, the amino acid sequences encoded by CCV 
S gene sequences, such as those appearing in Tables I and II 
above, which may be amplified by PCR, provide peptides 
useful in such diagnostic assays as ELISA or Western assay, 
or as antigens for the screening of sera or development of 
antibodies. 

For example, the sequences between about amino acid 1 
to about 250 [SEQ ID NO:57], about 450 to about 650 [SEQ 
ID NO:58], and about 900 to about 1150 [SEQ ID NO:59] of 
the CCV strain 1-71 S gene protein SEQ ID NO:2, are 
anticipated to be useful as such antigens. Such peptides can 
optionally also be used in the design of synthetic peptide 
coupled to a carrier for diagnostic uses, e.g., antibody 
detection in sera. Suitable carriers include ovalbumin, key- 
hole limpet hemocyanin, bovine serum albumin, sepharose 
beads and polydextran beads. 

Such peptide antigens and antibodies to these peptides 
would react positively with tissue or serum samples of dogs 
infected with CCV, but negatively with non-CCV infected 
dogs. These antibodies are discussed in more detail below. 

For example, the invention provides a method of using the 
full length CCV S protein or fragments thereof as diagnostic 
agents for identifying the presence or absence of antibodies 
in previously exposed, naive or vaccinated dogs, 
respectively, as well as for differentiating exposure to CCV 
from other related coronaviruses. Other S peptides or fusion 
proteins which show differential reactivity to CCV and other 
coronavirus sera may also be useful as CCV-specific 
reagents in ELISA-based screening assays to detect CCV 
exposure in dogs. Similarly, an S protein or peptide which 
contains epitopes recognized only by sera from CCV 
infected dogs or by sera from CCV positive dogs could be 
employed to distinguish or differentiate among coronavirus 
infections. 

As one assay format, the reactivity of affinity purified 
CCV S proteins or peptide fragments to canine biological 
fluids or cells can be assayed by Western blot. The assay is 
preferably employed on sera, but may also be adapted to be 
performed on other appropriate fluids or cells, for example, 
macrophages or white blood cells. In the Western blot 
technique, the purified protein, separated by a preparative 
SDS polyacrylamide gel, is transferred to nitrocellulose and 
cut into multiple strips. The strips are then probed with dog 
sera from uninfected or infected dogs. Binding of the dog 
sera to the protein is detected by incubation with alkaline 
phosphatase tagged goat anti-dog IgG followed by the 
enzyme substrate BCIP/NBT. Color development is stopped 
by washing the strip in water. 

CCV S protein or fragments thereof may also be used in 
an ELISA based assay for detecting CCV disease. A typical 
ELISA protocol would involve the adherence of antigen 
(e.g., a S protein) to the well of a 96-well tray. The serum to 
be tested is then added. If the serum contains antibody to the 
antigen, it will bind. Specificity of the reaction is determined 
by the antigen adsorbed to the plate. With the S protein, only 
sera from those dogs infected with CCV would bind to the 
plate; sera from naive or uninfected dogs would not bind. 

Similarly, a CCV S protein or peptide which contained 
epitopes recognized only by sera from CCV-infected dogs or 
by sera from CCV-positive dogs could be employed to 
distinguish coronavirus infections. After the primary anti- 
body is bound, an enzyme-labeled antibody directed against 
the globulin of the animal whose serum is tested is added. 
Substrate is then added. The enzyme linked to antibody 
bound to the well will convert the substrate to a visible form. 
The amount of color measured is proportional to the amount 
of antibody in the test material. In this manner, dogs infected 
with CCV can be identified and treated, or dogs naive to the 
virus can be protected by vaccination. 
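
Because the measured color is proportional to the amount of bound antibody, ELISA readings are commonly compared with a cutoff derived from negative control sera. The sketch below is an assumption-laden illustration rather than part of the described method: the cutoff rule (mean of the negative controls plus three standard deviations), the function names and the optical density values are all hypothetical.

```python
from statistics import mean, stdev


def elisa_cutoff(negative_control_ods, n_sd=3.0):
    """Cutoff = mean + n_sd * sample standard deviation of negative controls.

    The mean + 3 SD rule is a common convention assumed here; it is not
    specified in the text.
    """
    return mean(negative_control_ods) + n_sd * stdev(negative_control_ods)


def classify_sera(sample_ods, negative_control_ods):
    """Label each serum 'positive' or 'negative' relative to the cutoff."""
    cutoff = elisa_cutoff(negative_control_ods)
    return {name: ("positive" if od > cutoff else "negative")
            for name, od in sample_ods.items()}


if __name__ == "__main__":
    controls = [0.08, 0.10, 0.12]            # hypothetical naive-dog sera
    samples = {"dog_a": 0.95, "dog_b": 0.11}  # hypothetical test sera
    print(classify_sera(samples, controls))
```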

When used as diagnostic reagents, the primers, probes, 
peptide antigens, nucleotide sequence encoding or flanking 
a CCV S protein or fragment of the invention, and antibodies 
of this invention may be optionally associated with detect- 
able labels or label systems known to those skilled in the art. 
Such labelled diagnostic reagents may be used to assay for 
the presence of CCV in dogs in hybridization assays or in the 
PCR technique as described above. 

C. Diagnostic Kits 

The assay methods, PCR primers, CCV S nucleotide 
sequences [SEQ ID NO:1], S proteins and peptides, and 
antibodies described herein may be efficiently utilized in the 
assembly of a diagnostic kit, which may be used by veteri- 
narians or laboratories. The kit is useful in distinguishing 
between CCV infected animals and vaccinated animals, as 
well as non-exposed dogs, and between CCV-infected ani- 
mals and animals infected with serologically related viruses, 
such as other CCV or FIPV, TGEV, and FECV. Such a 
diagnostic kit contains the components necessary to practice 
the assays described above. 

Thus, the kit may contain a sufficient amount of at least 
one CCV S protein, fusion protein or peptide fragment, at 
least one CCV S gene nucleotide sequence or PCR primer 
pair of this invention, a MAb directed to a first epitope on 
the CCV S protein (which MAb may be labeled), optional 
additional components of a detectable labelling system, vials 
for containing the serum samples, protein samples and the 
like, and a second MAb conjugated to the second enzyme, 
which in proximity to the first enzyme, produces a visible 
product. Other conventional components of such diagnostic 
kits may also be included. 

Alternatively, a kit may contain a selected CCV S protein 
or peptide, a MAb directed against a selected CCV S peptide 
fragment bound to a solid surface and associated with a first 
enzyme, a different MAb associated with a second enzyme, 
and a sufficient amount of the substrate for the first enzyme, 
which, when added to the serum and MAbs, provides the 
reactant for the second enzyme, resulting in the color 
change. 

Other known assay formats will indicate the inclusion of 
additional components for a diagnostic kit according to this 
invention. 

The following examples illustrate the embodiments of this 
invention and do not limit the scope of the present invention. 


EXAMPLE 1 
Isolation of CCV 


Canine coronavirus strain 1-71 was isolated in 1971 from 
military dogs suffering from a viral gastroenteritis by Binn 
et al., Proceeding 78th Annual Meeting U.S. Animal Health 
Association, October 1974, p. 359-366. The initial isolate 
from the feces of the infected dog was grown in tissue 
culture on the PrDKTCA72 dog cell line [ATCC No. CRL 
1542]. The coronavirus strain used in this study was 
received from the ATCC (ATCC #VR-809, CCV Strain 
1-71, Frozen lot#4, Passage 7/PDK, 17 May 1988) and 
passaged five times on PrDKTCA72. 


EXAMPLE 2 


RNA Purification 


After the fifth passage the infected cells were processed 
for RNA isolation by infecting a 1700 cm² roller bottle with 
a CCV inoculum. The inoculum was prepared by diluting 
2.5 μl of infected fluids from a confluent monolayer into 13.0 
ml of media. One ml of this material was used to infect a 
roller bottle and the cells were grown until they demon- 
strated a pronounced cytopathic effect at 48 hours. The 
infected monolayers were harvested and total cytoplasmic 
RNA was extracted using the guanidinium thiocyanate pro- 
cedure as described in Chirgwin et al., Biochem., 18:5294 
(1979). 


EXAMPLE 3 


Primers Used for PCR Amplification of CCV 
Spike Gene Fragments 


The primers appearing below in Table III were synthe- 
sized conventionally by the phosphoramidite method and gel 
purified prior to use. Primer #3045 was based on an FECV 
S gene sequence; and primers #4920, 1923, 2443 and 2600 
were based on WT FIPV WSU 1146 sequences. 


TABLE III 

Amplified 
S Gene Region    Cloned Region    Top Primer    Bottom Primer 
1-362 aa         1-352 aa         # 3045        # 4920 
352-1452 aa      352-1452 aa      # 2600        # 1923 
1-555 aa         128-555 aa       # 3045        # 2443 

Primer #    DNA Sequence 
1923        TAAATAGGCCTTTAGTGGACATGCACTTTTTCAATTGG 
            [SEQ ID NO:46] StuI 
2443        TTAGTAGGCCTGTCGAGGCTATGGGTTGACCATAACCAC 
            [SEQ ID NO:47] StuI 
2600        CAGATCCCGGGTGTACAATCTGGTATGGGTGCTACAG 
            [SEQ ID NO:48] XmaI 
3045        GTGCCCCCGGGTATGATTGTGCTCGTAACTTGCCTCTTG 
            [SEQ ID NO:49] XmaI 
4920        AGCACCCATACCAGATTGTACATCTGCAGTGAAATTAAGATTG 
            [SEQ ID NO:50] PstI 


EXAMPLE 4 


PCR Amplification of CCV S Gene 


PCR amplified fragments of CCV S gene were generated 
using the following procedure. All PCR reagents were 
supplied by Perkin Elmer-Cetus, Norwalk, Conn. In a final 
reaction volume of 20 μl of 1× RT buffer (5× RT buffer: 250 
mM Tris-HCl, pH 8.3, 375 mM KCl, 15 mM MgCl2), the 
following components were assembled in RNAse-free sili- 
conized 500 μl microcentrifuge tubes: 1.0 mM of each 
dNTP, 20 units of RNAsin [Promega Corp, Madison, Wis.], 
2.5 picomoles of random hexamer oligonucleotides 
[Pharmacia, Milwaukee, Wis.], 100 picomoles/μl solution in 
TE buffer (10 mM Tris-HCl, 1 mM EDTA, pH 7.5), 200 
units of reverse transcriptase [Superscript RT, Bethesda 
Research Labs, Gaithersburg, Md.] and 1.0 μg of respective 
RNA isolated as described above in Example 2. To avoid 
pipetting errors and contamination, all solutions were ali- 
quoted from master mixes made with diethyl pyrocarbonate 
(DEPC) treated water and consisted of all of the reaction 
components except the RNA which was added last. 

The mixture was incubated in a programmable thermal 
cycler [Perkin-Elmer Cetus, Norwalk, Conn.] at 21° C. for 
ten minutes followed by 42° C. for one hour then 95° C. for 
five minutes and finally held at 4° C. until PCR amplifica- 
tion. 

Amplification of the cDNA was performed essentially 
according to the method of R. K. Saiki et al, Science, 
230:1350-1354 (1985) using the Taq polymerase. Briefly, to 
the 20 μl cDNA reaction mix from above was added 10.0 μl 
10× PCR buffer, 1.0 μl of each upstream and downstream 
primer previously diluted in water to 30 picomoles per 
microliter and 2.5 units of Taq polymerase (Perkin-Elmer 
Cetus, Norwalk, Conn.). Final volume was made up to 100 
μl using DEPC treated water and overlaid with 100 μl of 
mineral oil. As above, master mixes were prepared to avoid 
contamination. The reaction was performed in the Perkin- 
Elmer Cetus thermal cycler for one cycle by denaturing at 
95° C. for 1 minute, annealing at 37° C. for 3 minutes 
followed by an extension at 72° C. for 40 minutes. This 
initial cycle increased the likelihood of first strand DNA 
synthesis. A standard PCR profile was then performed by a 
95° C. for 1 minute denaturation, 37° C. for 3 minutes 
annealing, 72° C. for 3 minutes extension for 40 cycles. A 
final extension cycle was done by 95° C. for 1 minute 
denaturation, 37° C. for 2 minutes annealing, 72° C. for 15 
minutes extension and held at 4° C. until analyzed. 

PCR products were analyzed by electrophoresing 5.0 μl of 
the reaction on a 1.2% agarose gel for 16-17 hours. Bands 
were visualized by ethidium bromide staining the gel and 
fluorescence by UV irradiation at 256 nm. Photography 
using Polaroid type 55 film provided a negative that could be 
digitized for sample distance migration and comparison 
against markers run on each gel. The actual sizes of the 
bands were then calculated using the Beckman Microgenie 
software running on an IBM AT. 
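
The amplification conditions of this example can be written down as data and checked arithmetically. The following Python sketch is illustrative only: the structure and function names are hypothetical, the step values are those stated in this example, and the total covers programmed hold times only (ramp times between temperatures are ignored).

```python
# (label, number of cycles, [(temperature in deg C, minutes), ...])
# Values are taken from the amplification profile described in this example.
PROFILE = [
    ("first-strand cycle", 1, [(95, 1), (37, 3), (72, 40)]),
    ("standard cycles", 40, [(95, 1), (37, 3), (72, 3)]),
    ("final extension", 1, [(95, 1), (37, 2), (72, 15)]),
]


def total_hold_minutes(profile):
    """Sum of programmed hold times, ignoring temperature ramping."""
    return sum(cycles * sum(minutes for _, minutes in steps)
               for _, cycles, steps in profile)


def final_primer_micromolar(volume_ul, stock_pmol_per_ul, reaction_ul):
    """Final primer concentration; 1 pmol/ul is equivalent to 1 micromolar."""
    return volume_ul * stock_pmol_per_ul / reaction_ul


if __name__ == "__main__":
    print("programmed minutes:", total_hold_minutes(PROFILE))
    print("primer, micromolar:", final_primer_micromolar(1.0, 30.0, 100.0))
```

On these figures each primer is present at about 0.3 micromolar in the 100 μl reaction, and the programmed holds alone account for several hours of cycler time.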


EXAMPLE 5 


Cloning of CCV Spike Gene Regions 


Cloning procedures were performed substantially as 
described by Maniatis et al, cited above. Details of the 
clonings are provided in the following examples. Calf- 
alkaline phosphatase was from Bethesda Research Labs 
(Gaithersburg, Md.). Ligation products were transformed 
into E. coli host strain XL1 Blue [Stratagene Cloning 
Systems, La Jolla, Calif.]. pBluescript SK II M13- phagemid 
vector was also obtained from Stratagene Cloning Systems. 
All restriction enzymes were purchased from New England 
Biolabs (Beverly, Mass.) or Bethesda Research Labs 
(Gaithersburg, Md.) and used according to manufacturer’s 
specifications. T4 DNA ligase was received from Boe- 
hringer Mannheim Biochemicals (Indianapolis, Ind.). Calf 
intestinal alkaline phosphatase was purchased from 
Bethesda Research Labs. 


EXAMPLE 6 


CCV S Protein Fragment, A.A. 1-128 [SEQ ID 
NO:51] 


Five microliters (approximately 200 ng) of PCR- 
amplified DNA representing amino acids 1-362 [SEQ ID 
NO:53] of the CCV spike gene were ligated to the pT7Blue 
T-Vector (Novagen, Madison, Wis.) as per the manufactur- 
er’s instructions. One microliter of the ligation mix was used 
to transform NovaBlue competent cells (Novagen) and 
transformation mixes were plated on LB plates supple- 
mented with ampicillin, isopropylthio-β-galactoside (IPTG; 
Sigma Chemical Co., St. Louis, Mo.), and 5-bromo-4- 
chloro-3-indolyl-β-D-galactoside (X-gal; Sigma Chemical 
Co., St. Louis, Mo.). White colonies were picked and 
screened by restriction analysis of mini-prep DNA. Insert- 
bearing clones were identified and oriented with respect to 
vector by SmaI/PstI, StuI, and PstI digests. Clone #2964 
contained a full-length 1-362 amino acid insert and was 
used to provide sequence analysis from 1-128 amino acids 
of the CCV S gene. 


EXAMPLE 7 


CCV S Protein Fragment, A.A. 128-555 [SEQ ID 
NO: 43] 


10 μl of PCR DNA encoding amino acids 1-555 of the CCV spike 
protein was digested with SmaI/StuI for 4 hours at room 
temperature. DNA bands were isolated and purified from 
low-melting temperature agarose gels as described by 
Maniatis et al, cited above. Briefly, DNA fragments were 
visualized after staining with ethidium bromide, excised 
from the gel with a scalpel and transferred to microfuge 
tubes. Gel slices were incubated 5 min at 65° C., vortexed, 
and 5 volumes of 20 mM Tris, pH 8.0, 1 mM EDTA were 
added. Samples were incubated an additional 2 minutes at 
65° C. and were then extracted once with phenol and again 
with phenol:chloroform. The DNA was precipitated with 
1/10 volume 3 M NaOAc, pH 7.0, and 2.5 volumes of cold 
95% EtOH overnight at -20° C. Insert DNAs were ligated 
to SK II M13- SmaI-digested, dephosphorylated vector 
[Stratagene] for 4 hours at room temperature. Insert-bearing 
clones were identified by XhoI/SstI and BglI digests of 
mini-prep DNA. Restriction enzyme and sequence analysis 
indicated that the cloned insert was short by ~300 bp due to 
the presence of a StuI site at amino acid #128 of the CCV 
spike gene. Therefore, these clones contained the CCV S 
protein spanning amino acids from about 128-555 [SEQ ID 
NO:43]. 
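
The location of the internal StuI site can be checked against the opening of SEQ ID NO:1. The sketch below is an illustration under stated assumptions: it uses the standard StuI recognition sequence AGGCCT, the first 432 nucleotides as transcribed from the sequence listing, and hypothetical helper names.

```python
# First nine rows (432 nt) of SEQ ID NO:1, transcribed from the sequence listing.
S_GENE_5PRIME = "".join("""
ATG ATT GTG CTC GTA ACT TGC CTC TTG TTT TCG TAC AAT AGT GTG ATT
TGT ACA TCA AAC AAT GAC TGT GTA CAA GTT AAT GTG ACA CAA TTG CCT
GGC AAT GAA AAC ATT ATT AAA GAT TTT CTA TTT CAC ACC TTC AAA GAA
GAA GGA AGT GTA GTT GTT GGT GGT TAT TAC CCT ACA GAG GTG TGG TAT
AAC TGC TCC AGA AGC GCA ACA ACC ACC GCT TAC AAG GAT TTT AGT AAT
ATA CAT GCA TTC TAT TTT GAT ATG GAA GCC ATG GAG AAT AGT ACT GGC
AAT GCA CGA GGT AAA CCT TTA CTA GTA CAT GTT CAT GGT GAT CCT GTT
AGT ATC ATC ATA TAT ATA TCG GCT TAT AGA GAT GAT GTG CAA GGA AGG
CCT CTT TTA AAA CAT GGT TTG TTG TGT ATA ACT AAA AAT AAA ATC ATT
""".split())

STUI_SITE = "AGGCCT"  # standard StuI recognition sequence


def find_site(seq: str, site: str) -> int:
    """1-based position of the first occurrence of site, or 0 if absent."""
    return seq.find(site) + 1


def codon_number(nt_position: int) -> int:
    """1-based codon (amino acid) number containing a 1-based nucleotide."""
    return (nt_position - 1) // 3 + 1


if __name__ == "__main__":
    pos = find_site(S_GENE_5PRIME, STUI_SITE)
    print("StuI site starts at nucleotide", pos,
          "in codon", codon_number(pos))
```

In this transcription the site begins in the codon for amino acid 128, in agreement with the restriction and sequence analysis reported above.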


EXAMPLE 8 


CCV S Protein Fragment, A.A. 352-1452 [SEQ ID 
NO:52] 


PCR-amplified DNA fragments encoding amino acids 
352-1454 of the CCV spike protein were purified using 
Prime-Erase Quik Columns [Stratagene] according to the 
manufacturer’s instructions. Column-purified DNAs were 
then digested with XmaI/EcoRV overnight at 15° C. and 
subsequently isolated and eluted from low-melting tempera- 
ture agarose gels as described by Maniatis et al, cited above. 
Inserts were ligated overnight at 15° C. to SK II M13- XmaI/ 
StuI digested, dephosphorylated vector [Stratagene]. Clones 
were identified and oriented with respect to vector by 
XhoI/SstI and PvuII digests of mini-prep DNAs, respec- 
tively. 


EXAMPLE 9 


DNA Sequencing 


DNA sequence for the CCV S gene was determined from 
the individual clones #1775 (AA 352-1452; SEQ ID 
NO:52), #2007 (AA 128-555; SEQ ID NO:43) and #2964 
(AA 1-362; SEQ ID NO:53). Nested set deletions were 
prepared from each clone or internal primers synthesized to 
facilitate primer walking and the sequence determined from 
both strands [Lark Sequencing Technologies, Houston, 
Tex.]. The chain termination method performed as described 
in Sanger et al, Proc. Natl. Acad. Sci. USA, 74:5463-5467 
(1977) was used to determine the sequence of all clones. The 
full length sequence of the CCV S gene was assembled from 
overlapping sequences of each of the three separate frag- 
ments by computer analysis. 


DNA sequence analysis was performed using either Beck- 
man Microgenie programs on an IBM PS/2 Model 
70 or the University of Wisconsin GCG package of pro- 
grams implemented on a DEC VAX cluster [Devereux et al., 
(1984)]. 


SEQ ID NO:1 is the complete nucleotide sequence of the 
CCV strain 1-71 S gene. The amino acid [SEQ ID NO:2] 
and nucleotide [SEQ ID NO:1] sequences of CCV 1-71 total 
1452 amino acids and 4356 base pairs. CCV 1-71 has a 
DNA homology of 90.8% to published FIPV strain WT 
WSU 1146, 93.2% identity with FIPV strain DF2 and 94.1% 
similarity with FECV. In comparison to WSU 1146, this 
CCV strain further contains two amino acid deletions at 
positions 11 and 12, and two amino acid insertions at 
positions 118 and 119. In comparison to the amino acid 
sequences of other coronavirus S genes, the amino acid 
sequence of CCV is 82.2% homologous to TGEV, 89.7% 
homologous to DF2-HP, 90.0% homologous to TS-BP, 
92.9% homologous to TS, 93.2% homologous to DF2, and 
94.1% homologous to FECV. 
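
The relationship between the nucleotide and amino acid sequences can be illustrated with a short Python sketch. It is a minimal example with hypothetical names: it builds the standard genetic code, translates the first sixteen codons of SEQ ID NO:1 as printed in the sequence listing, and confirms that 1452 codons correspond to 4356 base pairs (4359 when a three-base stop codon is counted, which matches the length given in the sequence listing).

```python
from itertools import product

# Standard genetic code in TCAG order ('*' marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate an upper-case DNA string, ignoring any trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def coding_length(n_amino_acids: int, include_stop: bool = False) -> int:
    """Number of bases needed to encode a protein of the given length."""
    return 3 * n_amino_acids + (3 if include_stop else 0)


# First row of SEQ ID NO:1 as printed in the sequence listing.
FIRST_ROW = ("ATG ATT GTG CTC GTA ACT TGC CTC TTG TTT "
             "TCG TAC AAT AGT GTG ATT").replace(" ", "")

if __name__ == "__main__":
    print(translate(FIRST_ROW))
    print(coding_length(1452), coding_length(1452, include_stop=True))
```

The one-letter output corresponds to the three-letter residues Met Ile Val Leu Val Thr Cys Leu Leu Phe Ser Tyr Asn Ser Val Ile shown beneath the first row of the listing.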


The canine coronavirus S gene encoding amino acids 
#225-1325 [SEQ ID NO:54] has an overall homology to the 
published WT FIPV WSU 1146 strain at amino acids 352 to 
1454 of 95.9%. The homology level is increased to 97.5% 
when the comparison is done under the amino acid similarity 
rules as proposed by M. O. Dayhoff, Atlas of Protein 
Sequence and Structure, Vol. 5, Supp. 3, Natl. Biomed. Res. 
Found., Washington, D.C. (1978). There are 42 amino acid 
differences between the CCV S gene and the published 
sequence of WSU 1146 strain within the CCV sequence of 
SEQ ID NO: 2. Other CCV fragment homologies with WT 
FIPV WSU 1146 are illustrated in Table II above. 
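
For readers unfamiliar with how a percentage of this kind is obtained, the sketch below computes percent identity between two pre-aligned sequences of equal length. It is not the GCG or Microgenie software used in this example, it does not implement the Dayhoff similarity rules, and the function name and toy sequences are hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of aligned positions at which two equal-length sequences match.

    Both inputs are assumed to be already aligned; gaps, if any, are simply
    treated as ordinary characters.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b)
    return 100.0 * matches / len(seq_a)


if __name__ == "__main__":
    # Toy example: one difference in five aligned residues.
    print(percent_identity("MIVLV", "MIVLA"))
```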


Numerous modifications and variations of the present 
invention are included in the above-identified specification 
and are expected to be obvious to one of skill in the art. Such 
modifications and alterations to the compositions and pro- 
cesses of the present invention are believed to be encom- 
passed in the scope of the claims appended hereto. 


SEQUENCE LISTING 


(1) GENERAL INFORMATION: 


(iii) NUMBER OF SEQUENCES: 59 


(2) INFORMATION FOR SEQ ID NO: 1: 


(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 4359 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: unknown 


(ii) MOLECULE TYPE: DNA (genomic) 
(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..4356 


(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 


ATG ATT GTG CTC GTA ACT TGC CTC TTG TTT TCG TAC AAT AGT GTG ATT 48 
Met Ile Val Leu Val Thr Cys Leu Leu Phe Ser Tyr Asn Ser Val Ile 
1 5 10 15 
TGT ACA TCA AAC AAT GAC TGT GTA CAA GTT AAT GTG ACA CAA TTG CCT 96 
Cys Thr Ser Asn Asn Asp Cys Val Gln Val Asn Val Thr Gln Leu Pro 
20 25 30 
GGC AAT GAA AAC ATT ATT AAA GAT TTT CTA TTT CAC ACC TTC AAA GAA 144 
Gly Asn Glu Asn Ile Ile Lys Asp Phe Leu Phe His Thr Phe Lys Glu 
35 40 45 
GAA GGA AGT GTA GTT GTT GGT GGT TAT TAC CCT ACA GAG GTG TGG TAT 192 
Glu Gly Ser Val Val Val Gly Gly Tyr Tyr Pro Thr Glu Val Trp Tyr 
50 55 60 
AAC TGC TCC AGA AGC GCA ACA ACC ACC GCT TAC AAG GAT TTT AGT AAT 240 
Asn Cys Ser Arg Ser Ala Thr Thr Thr Ala Tyr Lys Asp Phe Ser Asn 
65 70 75 80 
ATA CAT GCA TTC TAT TTT GAT ATG GAA GCC ATG GAG AAT AGT ACT GGC 288 
Ile His Ala Phe Tyr Phe Asp Met Glu Ala Met Glu Asn Ser Thr Gly 
85 90 95 
AAT GCA CGA GGT AAA CCT TTA CTA GTA CAT GTT CAT GGT GAT CCT GTT 336 
Asn Ala Arg Gly Lys Pro Leu Leu Val His Val His Gly Asp Pro Val 
100 105 110 
AGT ATC ATC ATA TAT ATA TCG GCT TAT AGA GAT GAT GTG CAA GGA AGG 384 
Ser Ile Ile Ile Tyr Ile Ser Ala Tyr Arg Asp Asp Val Gln Gly Arg 
115 120 125 
CCT CTT TTA AAA CAT GGT TTG TTG TGT ATA ACT AAA AAT AAA ATC ATT 432 
Pro Leu Leu Lys His Gly Leu Leu Cys Ile Thr Lys Asn Lys Ile Ile 
130 135 140 
GAC TAT AAC ACG TTT ACC AGC GCA CAG TGG AGT GCC ATA TGT TTG GGT 480 
Asp Tyr Asn Thr Phe Thr Ser Ala Gln Trp Ser Ala Ile Cys Leu Gly 
145 150 155 160 
GAT GAC AGA AAA ATA CCA TTC TCT GTC ATA CCC ACA GGT AAT GGT ACA 528 
Asp Asp Arg Lys Ile Pro Phe Ser Val Ile Pro Thr Gly Asn Gly Thr 
165 170 175 
AAA ATA TTT GGT CTT GAG TGG AAT GAT GAC TAT GTT ACA GCC TAT ATT 576 
Lys Ile Phe Gly Leu Glu Trp Asn Asp Asp Tyr Val Thr Ala Tyr Ile 
180 185 190 
AGT GAT CGT TCT CAC CAT TTG AAC ATC AAT AAT AAT TGG TTT AAC AAT 624 
Ser Asp Arg Ser His His Leu Asn Ile Asn Asn Asn Trp Phe Asn Asn 
195 200 205 
GTG ACA ATC CTA TAC TCT CGA TCA AGC ACT GCT ACG TGG CAG AAG AGT 672 
Val Thr Ile Leu Tyr Ser Arg Ser Ser Thr Ala Thr Trp Gln Lys Ser 
210 215 220 


GcT 
Ala 
225 
AAT 


Asn 


TGC 
Cys 


TAT 
Tyr 


AGT 
Ser 


GTT 
val 
305 


TTT 
Phe 


Asn 


GTA 
Val 


GGT 
Gly 


AGT 
Ser 
385 
CGT 
Arg 


ACA 
Thr 


TTT 
Phe 


ATA 


Gct 


465 


ATT 





ToT 
Ser 


AGT 
Ser 





TAT 
Tyr 


GCA 
Ala 


AAC 
Asn 


TGC 
Cys 


ATA 
Ile 


Tce 
Ser 
290 


AAT 


Asn 


TGT 
Cys 


Asn 


CAA 
Gln 


GTC 
val 
370 


TTC 
Phe 


TAC 
Tyr 


TTA 
Leu 


TAT 
Tyr 


TcT 
Ser 
450 


TAC 
Tyr 


Lys 


CAA 
Gln 


GAA 
Glu 


TCA 
Ser 


TAT 


ACC 
Thr 


ACT 
Thr 





ccT 
Pro 
275 


ACG 
Thr 


TGT 
Cys 


TrT 
Phe 


ACA 
Thr 


TCT 
Ser 
355 


ATT 
Tle 


TAC 
Tyr 


TGT 
Cys 


CCA 
Pro 


ATT 
Ile 
435 


TIT 
Phe 


ACA 
Thr 


AAG 
Lys 


cTT 
Leu 


GTT 
val 
515 


CAT 
His 


GTT 
val 


AAT 
Asn 


GGC 
Gly 
260 


GAT 
Asp 


TTT 
Phe 


TTG 
Leu 


GAA 
Glu 


GTG 
val 
340 


GGT 
Gly 


cTT 
Leu 


AGT 
Ser 


TAC 
Tyr 


ccT 
Pro 
420 


AAT 
Asn 


AAT 
Asn 


TCG 
Ser 


GTG 
val 


ACT 
Thr 
500 


GGT 
Gly 


Acc 
Thr 


TAT 
Tyr 


GGC 
Gly 
245 


TAT 
Tyr 


GGC 
Gly 


GTT 
val 


TGG 
Trp 


GGT 
Gly 
325 


GAT 
Asp 


ATG 
Met 


GAG 
Glu 


TAT 
Tyr 


GCA 
Ala 
405 


AGT 
ser 


GGT 
Gly 


TTA 
Leu 


TAC 
Tyr 


ACG 
Thr 
485 


GcT 
Ala 


cTT 
Leu 


AGT 
Ser 


CAA 
Gln 
230 


TTIG 
Leu 


GcT 
Ala 


TTC 
Phe 


AGT 
Ser 


CCA 
Pro 
310 


GCG 
Ala 


GTC 
val 


GGT 
Gly 


ATT 
Ile 


GGT 
Gly 
390 


cTc 
Leu 


GTC 
val 


TAC 
Tyr 


Acc 
Thr 


AcT 
Thr 
470 


TAT 
Tyr 


AAT 
Asn 


GTC 
val 


GTT 
val 




GGT 
Gly 


Lys 


Acc 
Thr 


AGT 
Ser 


GGC 
Gly 
295 


GTG 
val 


CAG 
Gln 


ATT 
Ile 


GcT 
Ala 


TCT 
Ser 
375 


GAA 
Glu 


TAT 
Tyr 


Lys 


AAT 
Asn 


AcT 
Thr 
455 


GAC 
Asp 


TGT 
Cys 


TTG 
Leu 


AAT 
Asn 


AAT 
Asn 


GTT 
Val 


AGC 
Ser 


AAC 
Asn 


TTT 
Phe 
280 


AGA 
Arg 


ccc 
Pro 


TTT 
Phe 


AGA 
Arg 


ACA 
Thr 
360 


TGT 
Cys 


ATT 
Ile 


Asn 


GAA 
Glu 


TTC 
Phe 
440 


GGT 
Gly 


GCA 
Ala 


AAC 
Asn 


CAA 
Gln 


AAG 
Lys 
520 


ATA 
Ile 


TCA 
Ser 


TAT 
Tyr 


GTA 
Val 
265 


AAC 
Asn 


TTT 
Phe 


AGT 
Ser 


AGC 
Ser 


TTC 
Phe 
345 


GTA 
val 


TAT 
Tyr 


TCA 
Ser 


GGC 
Gly 


ATT 
Ile 
425 


TTT 
Phe 


GAT 
Asp 


TTA 
Leu 


AGT 
Ser 


AAT 
Asn 
505 


AGT 


Ser 


ACT 
Thr 


AAT 
Asn 


GAA 
Glu 
250 


TTT 
Phe 


AAT 
Asn 


GTA 
Val 


cTT 
Leu 


CAA 
Gln 
330 


Asn 


TET. 
Phe 


Asn 


TTC 
Phe 


ACG 
Thr 
410 


GcT 
Ala 


AGC 
Ser 


AGT 
Ser 


GTA 
Val 


CAC 
His 
490 


Gly 


GTT 
val 


ATT 
Ile 


TTT 
Phe 
235 


TTG 


Leu 


GCC 
Ala 


TGG 
Trp 


ACA 
Thr 


Gly 
315 


TGT 
Cys 


cTT 
Leu 


TCA 
Ser 


GAT 
Asp 


Gly 
395 


GcT 
Ala 


ATT 
Tle 


AcT 
Thr 


ATT 
Ile 


TTT 
Phe 


GTG 
Val 


GAT 
Asp 


ACT 
Thr 


TGT 
Cys 


CCG 
Pro 


TTT 
Phe 


Asn 
300 


GTc 
Val 


AAT 
Asn 


AAT 
Asn 


cTG 
Leu 


ACA 
Thr 
380 


GTA 
val 


cTT 
Leu 


AGT 
Ser 


TTT 
Phe 


GCA 
Ala 
460 


GTT 
val 


AAT 
Asn 


TAT 
Tyr 


TTA 
Leu 


cTT 
Leu 


TAT 
Tyr 


GAA 
Glu 


ACA 
Thr 


ATG 
Met 
285 


CAA 
Gln 


GCA 
Ala 


GGT 
Gly 


Tre 
Phe 


AAT 
Asn 
365 


GTG 
val 


AcT 
Thr 


AAG 
Lys 


AAG 
Lys 


ccT 
Pro 
445 


TTT 
Phe 


GAA 
Glu 


AAC 
Asn 


ccT 
Pro 


CTA 
Leu 
525 


GGT 
Gly 


TAC 
Tyr 


GAT 
Asp 


GTG 
Val 
270 


cTT 
Leu 


CCA 
Pro 


Ala 


GTG 
val 


ACC 
Thr 
350 


ACA 
Thr 


AGT 
Ser 


GAT 
Asp 


TAT 
Tyr 


TGG 
Trp 
430 


ATT 
Ile 


TGG 
Trp 


AAC 
Asn 


ATT 
Ile 


GTT 
Val 
510 


ccT 


Pro 


ATG 
Met 


AAG 
Lys 


TAT 
Tyr 
255 


GGC 
Gly 


ACA 
Thr 


TTA 
Leu 


Gln 


TCT 
Ser 
335 


ACA 
Thr 


ACA 
Thr 


GAG 
Glu 


GGA 
Gly 


TTA 
Leu 
415 


GGC 
Gly 


GAT 
Asp 


ACA 
Thr 


ACA 
Thr 


Lys 
495 


GcT 
Ala 


AGT 
Ser 


AAG 
Lys 


TTA 
Leu 
240 


GAA 
Glu 


Gly 


AAC 
Asn 


TTG 
Leu 


Glu 
320 


TTA 


Leu 


GAT 
Asp 


GGT 
Gly 


TCA 
Ser 


CCG 
Pro 
400 


GGA 
Gly 


CAT 
His 


TGT 
Cys 


ATT 
Ile 


Ala 
480 


TGT 
Cys 


TCA 
Ser 


TTC 
Phe 


CGT 


720 


768 


816 


864 


912 


960 


1008 


1056 


1104 


1152 


1200 


1248 


1296 


1344 


1392 


1440 


1488 


1536 


1584 


1632 




AGT 
Ser 
545 


CCA 


Pro 


TTT 
Phe 


GTG 
val 


Lys 


ACT 
Thr 
625 


Lys 


AGT 
Ser 


TCT 
Ser 


TGT 
Cys 


CAA 
Gln 
705 


GGT 
Gly 


GTC 
val 


ATA 
Ile 


CAT 
His 


Acc 
Thr 
785 


TGT 
Cys 


GcT 
Ala 


ATT 
Ile 


CAA 


530 


GGT 
Gly 


ATG 
Met 


TCA 
Ser 


TTT 
Phe 


ACT 
Thr 
610 


TTT 
Phe 


TTT 
Phe 


TTA 
Leu 


GAC 
Asp 


ACA 
Thr 
690 


ACT 
Thr 


GAC 
Asp 


ACG 
Thr 


GTT 
val 


TGG 
Trp 
770 


AAT 
Asn 


GAA 
Glu 


TTG 
Leu 


AGC 
Ser 


GTT 


TAT 
Tyr 


CAG 
Gln 


GTT 
val 


AAT 
Asn 
595 


GGT 
Gly 


AAC 
Asn 


GAT 
Asp 


TAT 
Tyr 


Asn 
675 


GAT 
Asp 


Asn 


TTG 
Leu 


CCA 
Pro 


GGA 
Gly 
755 


ACA 
Thr 


GAA 
Glu 


ccT 
Pro 


GTT 
val 


Acc 
Thr 
835 


GAG 


GGT 
Gly 


GAT 
Asp 


TAC 
Tyr 
580 


TCC 
Ser 


ACT 
Thr 


AAG 
Lys 


GTT 
val 


GTA 
val 
660 


AGT 
Ser 


TAT 
Tyr 


AGT 
Ser 


TTA 
Leu 


TGT 
Cys 
740 


GcT 
Ala 


ACA 
Thr 


AGG 
Arg 


ATC 
Ile 


TTT 
Phe 
820 


GGT 
Gly 


TAC 


CAA 
Gln 


AAT 
Asn 
565 


GTT 
val 


GAC 
Asp 


TGT 
Cys 


TTC 
Phe 


Gcc 
Ala 
645 


ATA 
Ile 


GGT 
Gly 


Asn 


ACG 
Thr 


GGG 
Gly 
725 


GAT 
Asp 


ATG 
Met 


ACA 
Thr 


AcT 
Thr 


ATA 
Ile 
805 


ATT 
Ile 


AAT 
Asn 


ATT 


ccc 
Pro 
550 


AAC 
Asn 


CAT 
His 


TGC 
Cys 


ccT 
Pro 


TGT 
Cys 
630 


GcT 
Ala 


TAT 
Tyr 


CTT 
Leu 


ATA 
Ile 


CTA 
Leu 
710 


TTT 
Phe 


GTA 
val 


AcT 
Thr 


ccT 
Pro 


CGT 
Arg 
790 


Acc 
Thr 


AAC 
Asn 


GTc 
val 


CAG 




535 


ATA 
Ile 


Acc 
Thr 


Tcc 
Ser 


ACA 
Thr 


TTC 
Phe 
615 


TTG 


Leu 


CGT 
Arg 


GAA 
Glu 


CAC 
His 


TAT 
Tyr 
695 


CTT 
Leu 


Lys 


AGC 
ser 


TCC 
Ser 


AAT 
Asn 
775 


GGC 
Gly 


TAT 
Tyr 


GTc 
val 


ACG 
Thr 


GTT 


6cc 
Ala 


GAT 
Asp 


ACT 
Thr 


GAT 
Asp 
600 


TCG 


Ser 


TCA 
Ser 


ACA 
Thr 


Glu 


GAC 
Asp 
680 


GGT 
Gly 


AGT 
Ser 


Asn 


GCA 
Ala 


ATT 
Ile 
760 


TTT 
Phe 


ACA 
Thr 


TcT 
Ser 


ACA 
Thr 


ATA 
Ile 
840 


TAC 


TCA 
Ser 


GTG 
Val 


TGT 
Cys 
585 


GTT 
Val 


TTT. 
Phe 


TTG 
Leu 


AGA 
Arg 


GGA 
Gly 
665 


TTG 


Leu 


AGA 
Arg 


GGC 
Gly 


GTT 
val 


Gln 
745 


AAT 


Asn 


TAT 
Tyr 


GCA 
Ala 


AAT 
Asn 


CAT 
His 
825 


ccT 
Pro 


ACT 


ACA 
Thr 


TAC 
Tyr 
570 


Lys 


TTA 
Leu 


GAT 
Asp 


AAT 
Asn 


ACC 
Thr 
650 


GAC 
Asp 


TCA 
Ser 


ACT 
Thr 


TTA 
Leu 


AGT 
Ser 
730 


Ala 


AGT 
Ser 


TAT 
Tyr 


ATT 
Ile 


ATA 
Ile 
810 


TcT 
Ser 


ACA 
Thr 


ACA 


TTA 
Leu 
555: 


TGC 
Cys 


AGT 
Ser 


TAT 
Tyr 


Lys 


ccT 
Pro 
635 


AAT 


Asn 


AAC 
Asn 


GTG 
Val 


GGT 
Gly 


TAT 
Tyr 
715 
GAT 
Asp 


GcT 
Ala 


GAA 
Glu 


TAT 
Tyr 


GAT 
Asp 
795 


Gly 


GAT 
Asp 


Asn 


CCG


540 


AGT 
Ser 


ATT 
Ile 


TcT 
Ser 


GcT 
Ala 


TTG 
Leu 
620 


GTT 
Val 


GAG 
Glu 


ATA 
Ile 


CTA 
Leu 


GTT 
val 
700 


TAC 
Tyr 


GGT 
Gly 


GTT 
Val 


ATG 
Met 


TCT 
Ser 
780 


AGT 
Ser 


GTT 
Val 


GGA 
Gly 


TTT 
Phe 


GTG 


AAC 
Asn 


CGT 
Arg 


TTA 
Leu 


ACA 
Thr 
605 


AAC 
Asn 


GGT 
Gly 


CAG 
Gln 


GTG 
val 


CAC 
His 
685 


GGT 
Gly 


ACA 
Thr 


GTC 
val 


ATT 
Ile


TTA 
Leu 
765 


ATA 
Ile


AAC 
Asn 


TGT 
Cys 


GAC 
Asp 


Acc 
Thr 
845 


ATC 
Ile 


TCT
Ser 


TGG 
Trp 
590 


Ala 


AAT 
Asn 


Ala 


GTT 
val 


GGT 
Gly 
670 


TTA 


Leu 


ATT 
Ile 


TCA 
Ser 


ATC 
Ile 


GAT 
Asp 
750 


Gly 


TAT 
Tyr 


GAT 
Asp 


Lys 


GTT 
Val 
830 


ATA 
Ile 


ATA 


ACA 
Thr 


AAC 
Asn 
575 


GAC 
Asp 


GTT 
Val 


TAC 
Tyr 


AAC 
Asn 


GTT 
val 
655 


GTG 
val 


GAC 
Asp 


ATT 
Ile 


CTA 
Leu 


TAT 
Tyr 
735 


Gly 


CTA 
Leu 


AAT 
Asn 


GTT 
Val 


AAT 
Asn 
815 


Gln 


TCT 
Ser 


GAT 


CTA 
Leu 
560 


CAA 
Gln 


GAT 
Asp 


ATA 
Ile 


TTA 
Leu 


TGC 
Cys 
640 


AGA 
Arg 


CCG 
Pro 


TCC
Ser 


AGA 
Arg 


TCA 
Ser 
720 


TCT 
Ser 


Ala 


ACA 
Thr 


TAT 
Tyr 


GAT 
Asp 
800 


Gly 


CCA 
Pro 


GTG 
Val 


TGT 


1680 


1728 


1776 


1824 


1872 


1920 


1968 


2016 


2064 


2112 


2160 


2208 


2256 


2304 


2352 


2400 


2448 


2496 


2544 


2592 




Gln 


TCA 
Ser 
865 


CAA 
Gln 


GCC
Ala 


AAT 
Asn 


TTA 
Leu 


GGA 
Gly 
945 


CGG 
Arg 


TTA 
Leu 


ATA 
Ile 


ccT 
Pro 


GCA 
Ala 


1025 


CCT
Pro 


AcT 
Thr 


CAA 
Gln 


ATA 
Ile 


Lys 


1105 


ACA 
Thr 


GAC 
Asp 


AGG 
Arg 


val 
850 


AGG 
Arg 


TAC 
Tyr 


AGA 
Arg 


Gcc 
Ala 


GAT 
Asp 
930 


GGT 
Gly 


TCG 
Ser 


GGT 
Gly 


GcT 
Ala 


GGT 
Gly 


1010 


GGT 
Gly 


TTT 
Phe 


GAT 
Asp 


GcT 
Ala 


CAT 
His 


1090 


GTG 
Val 


GTA 
val 


ATT 





CTG
Leu 


Glu Tyr Ile Gln 


TAC GTT TGC AAT 
Tyr Val Cys Asn 
870 


GTT TCT GCA TGT 
Val Ser Ala Cys 
885 


CTT GAA AAC ATG 
Leu Glu Asn Met 
900 


CTT AAA TTG GCA 
Leu Lys Leu Ala 
915 


CCT ATT TAC AAA 
Pro Ile Tyr Lys 


TTA AAA GAC ATA 
Leu Lys Asp Ile 
950 


GCT ATA GAA GAT 
Ala Ile Glu Asp 
965 


ACA GTT GAT GAA 
Thr Val Asp Glu 
980 


GAC TTA GTG TGT 
Asp Leu Val Cys 
995 


GTA GCT AAT GAT 
Val Ala Asn Asp 


GGT ATA ACA TTA 
Gly Ile Thr Leu 


1030 


GCA ATA GCA GTT 
Ala Ile Ala Val 
1045 


GTA TTG AGC AAG 
Val Leu Ser Lys 
1060 


ATT GGT AAC ATT 
Ile Gly Asn Ile 
1075 


CAA ACG TCA CAA 
Gln Thr Ser Gln 


CAA GAT GTT GTT 
Gln Asp Val Val 


1110 


CAA TTG CAA AAT 
Gln Leu Gln Asn 
1125 


TAT AAC AGG CTT 
Tyr Asn Arg Leu 
1140 


ATT ACA GGA AGA 
Ile Thr Gly Arg 
1155 




val 
855 


GGT 
Gly 


CAA 
Gln 


GAG 
Glu 


TcT 
Ser 


GAA 
Glu 
935 


TTG 


Leu 


TTG 
Leu 


GAT 
Asp 


GCA 
Ala 


GAC 
Asp 


1015 


GGT 
Gly 


CAA 
Gln 


AAC 
Asn 


ACA 
Thr 


GGT 
Gly 


1095 


AAC 
Asn 


AAT 
Asn 


GAT 
Asp 


cTT 
Leu 


US 6,372,224 B1 


Tyr Thr Thr Pro 


AAC CCT AGA TGC 
Asn Pro Arg Cys 
875 


ACT ATT GAG CAA 
Thr Ile Glu Gln 
890 


ATT GAT TCC ATG 
Ile Asp Ser Met 
905 


GTT GAA GCA TTC 
Val Glu Ala Phe 
920 


TGG CCT AAC ATT 
Trp Pro Asn Ile 


CCA TCT CAC AAC 
Pro Ser His Asn 
955 


CTT TTT GAT AAG 
Leu Phe Asp Lys 
970 


TAT AAA CGT TGT 
Tyr Lys Arg Cys 
985 


CAA TAT TAC AAT 
Gln Tyr Tyr Asn 
1000 


AAG ATG GCT ATG 
Lys Met Ala Met 


GCA CTT GGT GGT 
Ala Leu Gly Gly 
1035 


GCC AGA CTT AAT 
Ala Arg Leu Asn 
1050 


CAG CAG ATC CTG 
Gln Gln Ile Leu 
1065 


CAG GCA TTT GGT
Gln Ala Phe Gly 
1080 


CTT GCT ACT GTT 
Leu Ala Thr Val 


ACA CAA GGG CAA 
Thr Gln Gly Gln 
1115


TTC CAA GCC ATT 
Phe Gln Ala Ile 
1130 


GAA TTG AGT GCT 
Glu Leu Ser Ala 
1145 


ACA GCA CTT AAT 
Thr Ala Leu Asn 
1160 




val 
860 


AAT 
Asn 


GCA 
Ala 


TTG 
Leu 


AAT 
Asn 


GGT 
Gly 
940 


AGC 
Ser 


GTT 
val 


ACA 
Thr 


GGC 
Gly 


TAC 
Tyr 
1020 


GGC 
Gly 


TAT 
Tyr 


GcT 
Ala 


AAG 
Lys 


GcT 
Ala 
1100 


GcT 
Ala 


AGT 
Ser 


GAT 
Asp 


GCA 
Ala 


Ser Ile Asp Cys 


AAA TTG TTA ACG 
Lys Leu Leu Thr 
880 


CTT GCA ATG GGT 
Leu Ala Met Gly 
895 


TTT GTT TCG GAA 
Phe Val Ser Glu 
910 


AGT ACG GAA ACT 
Ser Thr Glu Thr 
925 


GGT TCT TGG CTA 
Gly Ser Trp Leu 


AAA CGT AAG TAC 
Lys Arg Lys Tyr 
960 


GTA ACA TCT GGC 
Val Thr Ser Gly 
975 


GGT GGT TAT GAC 
Gly Gly Tyr Asp 
990 


ATC ATG GTG CTA 
Ile Met Val Leu 
1005 


ACT GCA TCT CTT 
Thr Ala Ser Leu 


GCA GTG TCT ATA 
Ala Val Ser Ile 
1040 


GTT GCT CTA CAA 
Val Ala Leu Gln 
1055 


AAT GCT TTC AAT 
Asn Ala Phe Asn 
1070 


GTT AAT GAT GCT 
Val Asn Asp Ala 
1085 


AAA GCA TTG GCA 
Lys Ala Leu Ala 


TTA AGC CAC CTA 
Leu Ser His Leu 
1120 


AGT TCC ATT AGT 
Ser Ser Ile Ser 
1135 


GCA CAA GTT GAC 
Ala Gln Val Asp 
1150 


TTT GTG TCT CAG 
Phe Val Ser Gln 
1165 


2640 


2688 


2736 


2784 


2832 


2880 


2928 


2976 


3024 


3072 


3120 


3168 


3216 


3264 


3312 


3360 


3408 


3456 


3504 




ACT TTA 
Thr Leu 


1170 


GAC AAG 
Asp Lys 
1185 


TGT GGT 
Cys Gly 


GGC ATG 
Gly Met 


GTG ACG 
Val Thr 


GGA CTT 
Gly Leu 


1250 


GAC AAA 
Asp Lys 
1265 


ACT AGT 
Thr Ser 


AAT GCA 
Asn Ala 


ATT AAT 
Ile Asn 


ACT GTA 
Thr Val 


1330 


CTG ACT 
Leu Thr 
1345 


AAC ACC 
Asn Thr 


TTA GTC 
Leu Val 


CCT TGG 
Pro Trp 


ccc ATA 
Pro Ile 


1410 


GGG TGT 
Gly Cys 
1425 


GAA AGT 
Glu Ser 


ACC 
Thr 


GTA 
val 


AAT 
Asn 


ATC 
Ile 


GCC
Ala 


1235 


GTT 
Val 


TTC 
Phe 


TCT 
Ser 


ACT 
Thr 


CAA 
Gln 


1315 


ccT 
Pro 


GGT 
Gly 


ACA 
Thr 


AAT 
Asn 


TAT 
Tyr 


1395 


TTG 
Leu 


TTA 
Leu 


TAT 
Tyr 


AGA 
Arg 


AAT 
Asn 


GGT 
Gly 


TTC 
Phe 


1220 


TGG 
Trp 


GTT 
Val 


TAT 
Tyr 


GAT 
Asp 


GTA 
val 


1300 


ACT 
Thr 


GAG 
Glu 


GAA 
Glu 


GTA 
val 


cTT 
Leu 


1380 


GTG 
val 


CTA 
Leu 


GGA 
Gly 


GAA 
Glu 


(2) INFORMATION 


(i) SEQUENC 
(A) LENGTH: 
(B) TYPE: 


(D) TO 


CAA 
Gln 


GAA 
Glu 


ACA 
Thr 


1205 


TTT 
Phe 


TCA 
Ser 


AAG 
Lys 


TTG 
Leu 


TTT 
Phe 


1285 


ATT 
Ile 


GTT 
val 


TTG
Leu 


ATT 
Ile 


GAA 
Glu 


1365 


GAA 
Glu 


TGG 
Trp 


TTT 
Phe 


AGC 
Ser 


CCA 
Pro 





1445 


POLOGY : 




GCA GAG 
Ala Glu 


1175 


TGC GTT 
Cys Val 
1190 


CAT TTA 


His Leu 


CAC ACA 
His Thr 


GGT ATT 
Gly Ile 


GAT GTC 
Asp Val 


1255 


ACT ccc 
Thr Pro 
1270 


GTT CAA 
val Gln 


GAC TTG 
Asp Leu 


CAG GAC 
Gln Asp 


CCA CTT 
Pro Leu 


1335 


AAT GAC 
Asn Asp 
1350 


cTT GCT 
Leu Ala 


TGG CTC 
Trp Leu 


CTA CTA 
Leu Leu 


TGT TGT 
Cys Cys 


1415 


TGT TGT 
Cys Cys 
1430 


ATT GAA 
Ile Glu 


GTT 
Val 


AGG 
Arg 


TTT
Phe 


GTG 
Val 


TGT 
Cys 


1240 


CAG 
Gln 


AGA 
Arg 


ATT 
Ile


ccT 
Pro 


ATA 
Ile 


1320 


GAC 
Asp 


TTA 
Leu 


ATT 
Ile 


AAT 
Asn 


ATT 
Ile 


1400 


TGT 
Cys 


CAT 
His 


AAA 
Lys 


FOR SEQ ID NO: 


amino acid 


linear 


AGG 
Arg 


TcT 
Ser 


TCA 
Ser 


CTA 
Leu 


1225 


GCA 
Ala 


TTG 
Leu 


ACT 
Thr 


GAA 
Glu 


AGT 
Ser 


1305 


TTA 
Leu 


ATT 
Ile 


GAA 
Glu 


cTc 
Leu 


AGA 
Arg 


1385 


GGA 
Gly 


AGC 
Ser 


TCC 
Ser 


GTG 
val 


2: 


E CHARACTERISTICS: 
1452 amino acids 


GcT 
Ala 


CAA 
Gln 


cTT 
Leu 
1210 


TTA 
Leu 


TCA 
Ser 


ACG 
Thr 


ATG 
Met 


GGA 
Gly 
1290 


ATT 
Ile 


GAA 
Glu 


TTC 
Phe 


TTC
Phe 


ATT 
Ile 
1370 


ATT 
Ile 


TTA 
Leu 


ACT 
Thr 


ATA 
Ile 


CAT 
His 
1450
AGC AGA CAG
Ser Arg Gln 
1180 


TCT CAG AGA 
Ser Gln Arg 
1195 


GCA AAT GCA 
Ala Asn Ala 


CCA ACA GCT 
Pro Thr Ala 


GAT GGC GAT 
Asp Gly Asp 
1245 


cTG TTT CGC 
Leu Phe Arg 
1260 


TAT CAG CCT 
Tyr Gln Pro 
1275 


TGT GAT GTG 
Cys Asp Val 


ATA CCT GAC 
Ile Pro Asp 


AAT TTC AGA 
Asn Phe Arg 
1325 


AAT GCA ACC 
Asn Ala Thr 
1340 


AGG TCA GAA 
Arg Ser Glu 
1355 


GAT AAT ATT 
Asp Asn Ile 


GAA ACT TAT 
Glu Thr Tyr 


cTT 
Leu 


TTT 
Phe 


Ala 


TAT 
Tyr 
1230 


CGT 
Arg 


AAT 
Asn 


AGA 
Arg 


TTG 
Leu 


TAT 
Tyr 
1310 


CCA 


Pro 


TAC 
Tyr 


Lys 


AAT 
Asn 


GTA 
Val 
1390 


GCT AAA 
Ala Lys 


GGA TTC 
Gly Phe 
1200 


CCA AAT 
Pro Asn 
1215 


GAA ACC 
Glu Thr 


ACT TTT 
Thr Phe 


CTA GAT 
Leu Asp 


GTT GCA 
Val Ala 
1280 


TTT GTT 
Phe Val 
1295 


ATT GAT 
Ile Asp 


AAT TGG 
Asn Trp 


TTA AAC 
Leu Asn 


TTA CAT 
Leu His 
1360 


AAC ACA 
Asn Thr 
1375 


AAA TGG 
Lys Trp 


GTA GTA ATA TTC TGC ATA 
Val Val Ile Phe Cys Ile 


1405 


GGT TGT TGT GGA TGT ATT 
Gly Cys Cys Gly Cys Ile 


1420 


TGT AGT AGA AGG CGA TTT
Cys Ser Arg Arg Arg Phe 


1435 


GTC CAC TAA 
Val His 


1440 


3552 


3600 


3648 


3696 


3744 


3792 


3840 


3888 


3936 


3984 


4032 


4080 


4128 


4176 


4224 


4272 


4320 


4359 




Met 


Cys 


Gly 


Glu 


Asn 


65 


Ile 


Asn 


Ser 


Pro 


Asp 


145 


Asp 


Lys 


Ser 


Val 


Ala 


225 


Asn 


Cys 


Tyr 


ser 


val 


305 


Phe 


Asn 


val 


Gly 


ser 
385 




(ii) MOLECULE TYPE: protein


(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 


Ile Val Leu Val 


Thr 


Asn 


Gly 


50 


Cys 


His 


Ala 


Ile 


Leu 


130 


Tyr 


Asp 


Ile 


Asp 


Thr 


210 


Ala 


Asn 


Cys 


Ile 


Ser 


290 


Asn 


Cys 


Asn 


Gln 


val 


370 


Phe 


Ser 


Glu 


35 


Ser 


Ser 


Ala 


Arg 


Ile 


115 


Leu 


Asn 


Arg 


Phe 


Arg 


195 


Ile


Tyr 


Thr 


Thr 


Pro 


275 


Thr 


Cys 


Phe 


Thr 


Ser 


355 


Ile 


Tyr 


Asn 


20 


Asn 


val 


Arg 


Phe 


Gly 


100 


Ile 


Lys 


Thr 


Lys 


Gly 


180 


Ser 


Leu 


val 


Asn 


Gly 


260 


Asp 


Phe 


Leu 


Glu 


val 


340 


Gly 


Leu 


Ser 


5 


Asn 


Ile 


val 


Ser 


Tyr 


85 


Lys 


Tyr 


His 


Phe 


Ile 


165 


Leu 


His 


Tyr 


Tyr 


Gly 


245 


Tyr 


Gly 


val 


Trp 


Gly 


325 


Asp 


Met 


Glu 


Tyr 


Thr Cys Leu Leu Phe 


Asp 


Ile 


val 


Ala 


70 


Phe 


Pro 


Ile


Gly 


Thr 


150 


Pro 


Glu 


His 


Ser 


Gln 


230 


Leu 


Ala 


Phe 


Ser 


Pro 


310 


Ala 


val 


Gly 


Ile 


Gly 
390 


Cys 


Lys 


Gly 


55 


Thr 


Asp 


Leu 


Ser 


Leu 


135 


Ser 


Phe 


Trp 


Leu 


Arg 


215 


Gly 


Lys 


Thr 


Ser 


Gly 


295 


val 


Gln 


Ile 


Ala 


Ser 


375 


Glu 


Val 


Asp 


40 


Gly 


Thr 


Met 


Leu 


Ala 


120 


Leu 


Ala 


Ser 


Asn 


Asn 


200 


Ser 


val 


Ser 


Asn 


Phe 


280 


Arg 


Pro 


Phe 


Arg 


Thr 


360 


Cys 


Ile 


Gln 


25 


Phe 


Tyr 


Thr 


Glu 


Val 


105 


Tyr 


Cys 


Gln 


val 


Asp 


185 


Ile 


Ser 


Ser 


Tyr 


val 


265 


Asn 


Phe 


Ser 


Ser 


Phe 


345 


val 


Tyr 


Ser 


10 


Val 


Leu 


Tyr 


Ala 


Ala 


90 


His 


Arg 


Ile


Trp 


Ile


170 


Asp 


Asn 


Thr 


Asn 


Glu 


250 


Phe 


Asn 


Val 


Leu 


Gln 


330 


Asn 


Phe 


Asn 


Phe 


Ser 


Asn 


Phe 


Pro 


Tyr 


75 


Met 


Val 


Asp 


Thr 


Ser 


155 


Pro 


Tyr 


Asn 


Ala 


Phe 


235 


Leu 


Ala 


Trp 


Thr 


Gly 


315 


Cys 


Leu 


Ser 


Asp 


Gly 
395 


Tyr 


Val 


His 


Thr 


60 


Lys 


Glu 


His 


Asp 


Lys 


140 


Ala 


Thr 


val 


Asn 


Thr 


220 


Thr 


Cys 


Pro 


Phe 


Asn 


300 


Val 


Asn 


Asn 


Leu 


Thr 


380 


Val 


Asn 


Thr 


Thr 


45 


Glu 


Asp 


Asn 


Gly 


val 


125 


Asn 


Ile 


Gly 


Thr 


Trp 


205 


Trp 


Tyr 


Glu 


Thr 


Met 


285 


Gln 


Ala 


Gly 


Phe 


Asn 


365 


Val 


Thr 


Ser 


Gln 


30 


Phe 


Val 


Phe 


Ser 


Asp 


110 


Gln 


Lys 


Cys 


Asn 


Ala 


190 


Phe 


Gln 


Tyr 


Asp 


Val 


270 


Leu 


Pro 


Ala 


Val 


Thr 


350 


Thr 


Ser 


Asp 


Val 


15 


Leu 


Lys 


Trp 


Ser 


Thr 


95 


Pro 


Gly 


Ile 


Leu 


Gly 


175 


Tyr 


Asn 


Lys 


Lys 


Tyr 


255 


Gly 


Thr 


Leu 


Gln 


Ser 


335 


Thr 


Thr 


Glu 


Gly 


Ile 


Pro 


Glu 


Tyr 


Asn 


80 


Gly 


Val 


Arg 


Ile


Gly 


160 


Thr 


Ile 


Asn 


Ser 


Leu 


240 


Glu 


Gly 


Asn 


Leu 


Glu 


320 


Leu 


Asp 


Gly 


Ser 


Pro 
400 




Arg 


Thr 


Phe 


Ile 


Ala 


465 


Ile 


Ser 


Ser 


Tyr 


Ser 


545 


Pro 


Phe 


Val 


Lys 


Thr 


625 


Lys 


ser 


ser 


Cys 


Gln 


705 


Gly 


val 


Ile 


His 


Thr 


785 


Cys 


Tyr 


Leu 


Tyr 


Ser 


450 


Tyr 


Lys 


Gln 


Glu 


Ser 


530 


Gly 


Met 


Ser 


Phe 


Thr 


610 


Phe 


Phe 


Leu 


Asp 


Thr 


690 


Thr 


Asp 


Thr 


Val 


Trp 


770 


Asn 


Glu 


Cys 


Pro 


Ile 


435 


Phe 


Thr 


Lys 


Leu 


val 


515 


His 


Tyr 


Gln 


val 


Asn 


595 


Gly 


Asn 


Asp 


Tyr 


Asn 


675 


Asp 


Asn 


Leu 


Pro 


Gly 


755 


Thr 


Glu 


Pro 


Tyr 


Pro 


420 


Asn 


Asn 


Ser 


val 


Thr 


500 


Gly 


Thr 


Gly 


Asp 


Tyr 


580 


Ser 


Thr 


Lys 


val 


val 


660 


Ser 


Tyr 


Ser 


Leu 


Cys 


740 


Ala 


Thr 


Arg 


Ile 


Ala 


405 


Ser 


Gly 


Leu 


Tyr 


Thr 


485 


Ala 


Leu 


Ser 


Gln 


Asn 


565 


val 


Asp 


Cys 


Phe 


Ala 


645 


Ile 


Gly 


Asn 


Thr 


Gly 


725 


Asp 


Met 


Thr 


Thr 


Ile 
805 


Leu 


val 


Tyr 


Thr 


Thr 


470 


Tyr 


Asn 


val 


val 


Pro 


550 


Asn 


His 


Cys 


Pro 


Cys 


630 


Ala 


Tyr 


Leu 


Ile 


Leu 


710 


Phe 


val 


Thr 


Pro 


Arg 


790 


Thr 




Tyr 
Lys 
Asn 
Thr 
455 
Asp 
Cys 
Leu 
Asn 
Asn 
535 
Ile 
Thr 
Ser 
Thr 
Phe 
615 
Leu 
Arg 
Glu 
His 
Tyr 
695 
Leu 
Lys 
Ser 
Ser 
Asn 
775 


Gly 


Tyr 


Asn 


Glu 


Phe 


440 


Gly 


Ala 


Asn 


Gln 


Lys 


520 


Ile 


Ala 


Asp 


Thr 


Asp 


600 


Ser 


Ser 


Thr 


Glu 


Asp 


680 


Gly 


Ser 


Asn 


Ala 


Ile 


760 


Phe 


Thr 


Ser 


Gly 


Ile 


425 


Phe 


Asp 


Leu 


Ser 


Asn 


505 


Ser 


Thr 


Ser 


val 


Cys 


585 


val 


Phe 


Leu 


Arg 


Gly 


665 


Leu 


Arg 


Gly 


val 


Gln 


745 


Asn 


Tyr 


Ala 


Asn 


Thr 


410 


Ala 


Ser 


Ser 


Val 


His 


490 


Gly 


Val 


Ile 


Thr 


Tyr 


570 


Lys 


Leu 


Asp 


Asn 


Thr 


650 


Asp 


Ser 


Thr 


Leu 


Ser 


730 


Ala 


Ser 


Tyr 


Ile 


Ile 
810 


Ala 


Ile 


Thr 


Gly 


Gln 


475 


Ile 


Phe 


Val 


Asp 


Leu 


555 


Cys 


Ser 


Tyr 


Lys 


Pro 


635 


Asn 


Asn 


Val 


Gly 


Tyr 


715 


Asp 


Ala 


Glu 


Tyr 


Asp 


795 


Gly
Leu


Ser 


Phe 


Ala 


460 


Val 


Asn 


Tyr 


Leu 


Leu 


540 


Ser 


Ile 


Ser 


Ala 


Leu 


620 


val 


Glu 


Ile 


Leu 


Val 


700 


Tyr 


Gly 


val 


Met 


Ser 


780 


Ser 


Val 


Lys 


Lys 


Pro 


445 


Phe 


Glu 


Asn 


Pro 


Leu 


525 


Gly 


Asn 


Arg 


Leu 


Thr 


605 


Asn 


Gly 


Gln 


Val 


His 


685 


Gly 


Thr 


Val 


Ile 


Leu 


765 


Ile 


Asn 


Cys 


Tyr 


Trp


430 


Ile 


Trp 


Asn 


Ile 


Val 


510 


Pro 


Met 


Ile 


Ser 


Trp 


590 


Ala 


Asn 


Ala 


val 


Gly 


670 


Leu 


Ile 


Ser 


Ile 


Asp 


750 


Gly 


Tyr 


Asp 


Lys 


Leu 


415 


Gly 


Asp 


Thr 


Thr 


Lys 


495 


Ala 


Ser 


Lys 


Thr 


Asn 


575 


Asp 


val 


Tyr 


Asn 


val 


655 


val 


Asp 


Ile 


Leu 


Tyr 


735 


Gly 


Leu 


Asn 


Val 


Asn 
815 


Gly 


His 


Cys 


Ile 


Ala 


480 


Cys 


Ser 


Phe 


Leu 


560 


Gln 


Asp 


Ile


Leu 


Cys 


640 


Arg 


Pro 


Ser 


Ser 


720 


Ser 


Ala 


Thr 


Tyr 


Asp 


800 


Gly 




Ala 


Ile 


Gln 


Ser 


865 


Gln 


Ala 


Asn 


Leu 


Gly 


945 


Arg 


Leu 


Ile


Pro 


Leu 


Ser 


val 


850 


Arg 


Tyr 


Arg 


Ala 


Asp 


930 


Gly 


Ser 


Gly 


Ala 


Gly 


1010 


val 


Thr 


835 


Glu 


Tyr 


val 


Leu 


Leu 


915 


Pro 


Leu 


Ala 


Thr 


Asp 


995 


val 


Ala Gly Gly 


1025 


Pro Phe Ala 


Thr Asp Val 


Phe 


820 


Gly 


Tyr 


val 


Ser 


Glu 


900 


Lys 


Ile 


Lys 


Ile


val 


980 


Leu 


Ala 


Ile 


Ile 




Ile Asn Val Thr His 
825 


Asn Val Thr Ile Pro 
840 


Ile Gln Val Tyr Thr 
855 


Cys Asn Gly Asn Pro 
870 


Ala Cys Gln Thr Ile 
885 


Asn Met Glu Ile Asp 
905 


Leu Ala Ser Val Glu 
920 


Tyr Lys Glu Trp Pro 
935 


Asp Ile Leu Pro Ser 
950 


Glu Asp Leu Leu Phe 
965 


Asp Glu Asp Tyr Lys 
985 


Val Cys Ala Gln Tyr 
1000 


Asn Asp Asp Lys Met 
1015 


Thr Leu Gly Ala Leu 
1030 


Ala Val Gln Ala Arg 
1045 


Ser 


Thr 


Thr 


Arg 


Glu 


890 


Ser 


Ala 


Asn 


His 


Asp 


970 


Arg 


Tyr 


Ala 


Gly
Asp


Asn 


Pro 


Cys 


875 


Gln 


Met 


Phe 


Ile 


Asn 


955 


Lys 


Cys 


Asn 


Met 






Gly 


Phe 


Val 


860 


Asn 


Ala 


Leu 


Asn 


Gly 


940 


Ser 


val 


Thr 


Gly 


Tyr 
1020 


Asp Val Gln Pro 
830 


Thr Ile Ser Val 
845 


Ser Ile Asp Cys 
Lys Leu Leu Thr 
880 


Leu Ala Met Gly 
895 


Phe Val Ser Glu 
910 


Ser Thr Glu Thr 
925 


Gly Ser Trp Leu 
Lys Arg Lys Tyr 
960 


Val Thr Ser Gly 
975 


Gly Gly Tyr Asp 
990 


Ile Met Val Leu 
1005 


Thr Ala Ser Leu 


Gly Gly Ala Val Ser Ile 


1035 


1040 


Leu Asn Tyr Val Ala Leu Gln 


1050 


1055 


Leu Ser Lys Asn Gln Gln Ile Leu Ala Asn Ala Phe Asn 


1060 


1065 


1070 


Gln Ala Ile Gly Asn Ile Thr Gln Ala Phe Gly Lys Val Asn Asp Ala 


1075 


Ile His Gln 


1090 


Lys Val Gln 


1105 


Thr 


Asp 


Arg 


Thr 


val 


Ile 


Leu Ile Thr Gly Arg 
1155 


Leu 


1170 


Gln 


Tyr 


Thr 


Asp Lys Val 


1185 


Cys Gly Asn 


Gly Met Ile 


Val Thr Ala 


Thr 


Asp 


Leu 


Asn 


1140 


Arg 


Asn 


Gly 


Phe 


1220 


Trp 


1080 


1085 


Ser Gln Gly Leu Ala Thr Val Ala Lys Ala Leu Ala 


1095 


1100 


Val Val Asn Thr Gln Gly Gln Ala Leu Ser His Leu 


1110 


1115 


1120 


Gln Asn Asn Phe Gln Ala Ile Ser Ser Ser Ile Ser 


1125 


1130 


1135 


Arg Leu Asp Glu Leu Ser Ala Asp Ala Gln Val Asp 


1145 


1160 


1150 


Leu Thr Ala Leu Asn Ala Phe Val Ser Gln 


1165 


Gln Ala Glu Val Arg Ala Ser Arg Gln Leu Ala Lys 


1175 


1180 


Glu Cys Val Arg Ser Gln Ser Gln Arg Phe Gly Phe 


1190 


1195 


1200 


Thr His Leu Phe Ser Leu Ala Asn Ala Ala Pro Asn 


1205 


1210 


1215 


Phe His Thr Val Leu Leu Pro Thr Ala Tyr Glu Thr 


1225 


1230 


Ser Gly Ile Cys Ala Ser Asp Gly Asp Arg Thr Phe
1235 1240


1245 




Gly Leu Val Val Lys Asp Val Gln Leu Thr Leu Phe Arg Asn Leu Asp 


1250 1255 1260 


Asp Lys Phe Tyr Leu Thr Pro Arg Thr Met Tyr Gln Pro Arg Val Ala 


1265 1270 1275 


1280 


Thr Ser Ser Asp Phe Val Gln Ile Glu Gly Cys Asp Val Leu Phe Val 


1285 1290 


1295 


Asn Ala Thr Val Ile Asp Leu Pro Ser Ile Ile Pro Asp Tyr Ile Asp 


1300 1305 


1310 


Ile Asn Gln Thr Val Gln Asp Ile Leu Glu Asn Phe Arg Pro Asn Trp 


1315 1320 


1325 


Thr Val Pro Glu Leu Pro Leu Asp Ile Phe Asn Ala Thr Tyr Leu Asn 


1330 1335 1340 


Leu Thr Gly Glu Ile Asn Asp Leu Glu Phe Arg Ser Glu Lys Leu His 


1345 1350 1355 


1360 


Asn Thr Thr Val Glu Leu Ala Ile Leu Ile Asp Asn Ile Asn Asn Thr 


1365 1370 


1375 


Leu Val Asn Leu Glu Trp Leu Asn Arg Ile Glu Thr Tyr Val Lys Trp 


1380 1385 


1390 


Pro Trp Tyr Val Trp Leu Leu Ile Gly Leu Val Val Ile Phe Cys 


1395 1400 


1405 


Pro Ile Leu Leu Phe Cys Cys Cys Ser Thr Gly Cys Cys Gly Cys 


1410 1415 1420 


Gly Cys Leu Gly Ser Cys Cys His Ser Ile Cys Ser Arg Arg Arg 


1425 1430 1435 
Glu Ser Tyr Glu Pro Ile Glu Lys Val His Val His 
1445 1450 
(2) INFORMATION FOR SEQ ID NO: 3: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 201 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: unknown 
(ii) MOLECULE TYPE: protein 


(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 


Gly Ser Val Val Val Gly Gly Tyr Tyr Pro Thr Glu 
1 5 10 


Cys Ser Arg Ser Ala Thr Thr Thr Ala Tyr Lys Asp 
20 25 


His Ala Phe Tyr Phe Asp Met Glu Ala Met Glu Asn 
35 40 


Ala Arg Gly Lys Pro Leu Leu Val His Val His Gly 
50 55 60 


Ile Ile Ile Tyr Ile Ser Ala Tyr Arg Asp Asp Val 
65 70 75


Leu Leu Lys His Gly Leu Leu Cys Ile Thr Lys Asn 
85 90 


Tyr Asn Thr Phe Thr Ser Ala Gln Trp Ser Ala Ile 
100 105 


Asp Arg Lys Ile Pro Phe Ser Val Ile Pro Thr Gly 
115 120 


Ile Phe Gly Leu Glu Trp Asn Asp Asp Tyr Val Thr 
130 135 140 


Val 


Phe 


Ser 


45 


Asp 


Gln 


Lys 


Cys 


Asn 


125 


Ala 


Trp 


Ser 


30 


Thr 


Pro 


Gly 


Ile 


Leu 


110 


Gly 


Tyr 


Tyr 


15 


Asn 


Gly 


Val 


Arg 


Ile 


95 


Gly 


Thr 


Ile 


Ile 


Ile 


Phe 
1440 


As 


Ile


As 


Ser


Pro


80 


As 


As 


Lys


Ser
Asp Arg Ser His His Leu Asn Ile Asn Asn Asn Trp Phe Asn Asn Val
145 150 155 160 


Thr Ile Leu Tyr Ser Arg Ser Ser Thr Ala Thr Trp Gln Lys Ser Ala
165 170 175 


Ala Tyr Val Tyr Gln Gly Val Ser Asn Phe Thr Tyr Tyr Lys Leu As 
180 185 190 


Asn Thr Asn Gly Leu Lys Ser Tyr Glu 
195 200 





(2) INFORMATION FOR SEQ ID NO: 4: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 51 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: unknown 
(ii) MOLECULE TYPE: protein 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 


Ser Cys Tyr Asn Asp Thr Val Ser Glu Ser Ser Phe Tyr Ser Tyr Gl 
1 5 10 15 


Glu Ile Ser Phe Gly Val Thr Asp Gly Pro Arg Tyr Cys Tyr Ala Leu
20 25 30 


Tyr Asn Gly Thr Ala Leu Lys Tyr Leu Gly Thr Leu Pro Pro Ser Val
35 40 45 


Lys Glu Ile 
50 
(2) INFORMATION FOR SEQ ID NO: 5: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 21 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: unknown 
(ii) MOLECULE TYPE: protein 


(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 


Ser Phe Asn Leu Thr Thr Gly Asp Ser Gly Ala Phe Trp Thr Ile Ala
1 5 10 15


Tyr Thr Ser Tyr Thr 
20 
(2) INFORMATION FOR SEQ ID NO: 6: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 51 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: unknown 
(ii) MOLECULE TYPE: protein 


(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 


Pro Ile Ala Ser Thr Leu Ser Asn Ile Thr Leu Pro Met Gln Asp As 
1 5 10 15 


Asn Thr Asp Val Tyr Cys Ile Arg Ser Asn Gln Phe Ser Val Tyr Val
20 25 30 


His Ser Thr Cys Lys Ser Ser Leu Trp Asp Asp Val Phe Asn Ser As 
35 40 45 


Cys Thr Asp 
50
(2) INFORMATION FOR SEQ ID NO: 7:
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 51 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: unknown 
(ii) MOLECULE TYPE: protein 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 


Thr Asn Glu Gln Val Val Arg Ser Leu Tyr Val Ile Tyr Glu Glu Gl 
1 5 10 15 


Asp Asn Ile Val Gly Val Pro Ser Asp Asn Ser Gly Leu His Asp Leu
20 25 30 


Ser Val Leu His Leu Asp Ser Cys Thr Asp Tyr Asn Ile Tyr Gly Arg
35 40 45 


Thr Gly Val 
50 
(2) INFORMATION FOR SEQ ID NO: 8: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 81 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: unknown 
(ii) MOLECULE TYPE: protein 


(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 


Trp Thr Thr Thr Pro Asn Phe Tyr Tyr Tyr Ser Ile Tyr Asn Tyr Thr
1 5 10 15 


Asn Glu Arg Thr Arg Gly Thr Ala Ile Asp Ser Asn Asp Val Asp Cys
20 25 30 


Glu Pro Ile Ile Thr Tyr Ser Asn Ile Gly Val Cys Lys Asn Gly Ala
35 40 45 


Leu Val Phe Ile Asn Val Thr His Ser Asp Gly Asp Val Gln Pro Ile
50 55 60 


Ser Thr Gly Asn Val Thr Ile Pro Thr Asn Phe Thr Ile Ser Val Gln
65 70 75 80 


val 


(2) INFORMATION FOR SEQ ID NO: 9: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 126 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: unknown 
(ii) MOLECULE TYPE: protein 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 


Glu Asn Met Glu Ile Asp Ser Met Leu Phe Val Ser Glu Asn Ala Leu
1 5 10 15 


Lys Leu Ala Ser Val Glu Ala Phe Asn Ser Thr Glu Thr Leu Asp Pro
20 25 30 


Ile Tyr Lys Glu Trp Pro Asn Ile Gly Gly Ser Trp Leu Gly Gly Leu
35 40 45 


Lys Asp Ile Leu Pro Ser His Asn Ser Lys Arg Lys Tyr Arg Ser Ala
50 55 60 


Ile Glu Asp Leu Leu Phe Asp Lys Val Val Thr Ser Gly Leu Gly Thr
65 70 75 80 


val 


Leu 


Ala 


(2) 


Gln 


val 


Leu 


Phe 


Ala 
65 


(2) 


Leu 


Lys 


ser 


Ala 


Asn 


65 


val 


Thr 


Thr 


val 


Trp 
145
Asp Glu Asp Tyr Lys Arg Cys Thr Gly Gly Tyr Asp
85 90 


Val Cys Ala Gln Tyr Tyr Asn Gly Ile Met Val Leu 
100 105 


Asn Asp Asp Lys Met Ala Met Tyr Thr Ala Ser Leu 
115 120 125 
(2) INFORMATION FOR SEQ ID NO: 10:
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 76 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: unknown 
(ii) MOLECULE TYPE: protein 


(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 


Val Asp Arg Leu Ile Thr Gly Arg Leu Thr Ala Leu 
5 10 


Ser Gln Thr Leu Thr Arg Gln Ala Glu Val Arg Ala 
20 25 


Ala Lys Asp Lys Val Asn Glu Cys Val Arg Ser Gln 
35 40 45 


Gly Phe Cys Gly Asn Gly Thr His Leu Phe Ser Leu 
50 55 60 


Pro Asn Gly Met Ile Phe Phe His Thr Val Leu 
70 75 
(2) INFORMATION FOR SEQ ID NO: 11:
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 203 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: unknown 
(ii) MOLECULE TYPE: protein 


(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 


Val Val Lys Asp Val Gln Leu Thr Leu Phe Arg Asn 
5 10 


Phe Tyr Leu Thr Pro Arg Thr Met Tyr Gln Pro Arg 
20 25 


Ser Asp Phe Val Gln Ile Glu Gly Cys Asp Val Leu 
35 40 45 


Thr Val Ile Asp Leu Pro Ser Ile Ile Pro Asp Tyr 
50 55 60 


Gln Thr Val Gln Asp Ile Leu Glu Asn Phe Arg Pro 
70 75 


Pro Glu Leu Pro Leu Asp Ile Phe Asn Ala Thr Tyr 
85 90 


Gly Glu Ile Asn Asp Leu Glu Phe Arg Ser Glu Lys 
100 105 


Thr Val Glu Leu Ala Ile Leu Ile Asp Asn Ile Asn 
115 120 125 


Asn Leu Glu Trp Leu Asn Arg Ile Glu Thr Tyr Val 
130 135 140 


Tyr Val Trp Leu Leu Ile Gly Leu Val Val Ile Phe 
150 155 


Ile Ala As 


95 


Pro Gly Val


110 


Ala 


Asn 
Ser 
30 


Ser 


Ala 


Leu 


Val 


30 


Phe 


Ile 


Asn 


Leu 


Leu 


110 


Asn 


Lys 


Cys 


Ala 
15 
Arg 


Gln 


Asn 


Asp 


15 


Ala 


Val 


Asp 


Trp 


Asn 


95 


His 


Thr 


Trp 


Ile 


Phe


Gl 


Arg


Ala


As 


Thr


As 


Lh 


Thr


80 


Leu


As 


Leu


Pro


Pro
160 


Ile Leu Leu Phe Cys Cys Cys Ser Thr Gly Cys Cys Gly Cys Ile Gly
165 170 175 


Cys Leu Gly Ser Cys Cys His Ser Ile Cys Ser Arg Arg Arg Phe Glu
180 185 190 


Ser Tyr Glu Pro Ile Glu Lys Val His Val His 
195 200 
(2) INFORMATION FOR SEQ ID NO: 12: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 8 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: unknown 
(ii) MOLECULE TYPE: protein 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 
Asp Phe Leu Phe His Thr Phe Lys 
1 5 
(2) INFORMATION FOR SEQ ID NO: 13: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 19 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: unknown 
(ii) MOLECULE TYPE: protein 


(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 


Trp Tyr Asn Cys Ser Arg Ser Ala Thr Thr Thr Ala Tyr Lys Asp Phe 
1 5 10 15 


Ser Asn Ile 


(2) INFORMATION FOR SEQ ID NO: 14: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 5 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: unknown 
(ii) MOLECULE TYPE: protein 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 
Tyr Val Thr Ala Tyr 
1 5 
(2) INFORMATION FOR SEQ ID NO: 15: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 34 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: unknown 
(ii) MOLECULE TYPE: protein 


(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 


Asn Asn Thr Asn Gly Leu Lys Ser Tyr Glu Leu Cys Glu Asp Tyr Glu 
1 5 10 15 


Cys Cys Thr Gly Tyr Ala Thr Asn Val Phe Ala Pro Thr Val Gly Gly 
20 25 30 


Tyr Ile 


(2) INFORMATION FOR SEQ ID NO: 16: 


Ser 


(2) 


Gly 


Ala 


Ile 


(2) 


Ser 


Val 


(2) 


Ile 


(2) 


Lys
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 7 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: unknown 
(ii) MOLECULE TYPE: protein 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 
Leu Asn Asn Thr Val Asp 
5 
(2) INFORMATION FOR SEQ ID NO: 17:
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 34 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: unknown 
(ii) MOLECULE TYPE: protein 


(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 


Val Thr Asp Gly Pro Arg Tyr Cys Tyr Ala Leu Tyr Asn Gly Thr 
5 10 15 


Leu Lys Tyr Leu Gly Thr Leu Pro Pro Ser Val Lys Glu Ile Ala 
20 25 30 


Ser 


(2) INFORMATION FOR SEQ ID NO: 18:
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 27 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: unknown 
(ii) MOLECULE TYPE: protein 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 


Tyr Thr Asp Ala Leu Val Gln Val Glu Asn Thr Ala Ile Lys Lys 
5 10 15 


Thr Tyr Cys Asn Ser His Ile Asn Asn Ile 
20 25 
(2) INFORMATION FOR SEQ ID NO: 19:
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 15 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: unknown 
(ii) MOLECULE TYPE: protein 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 
Ser Val Gln Val Glu Tyr Ile Gln Val Tyr Thr Thr Pro Val 
5 10 15 
(2) INFORMATION FOR SEQ ID NO: 20:
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 37 amino acids 
(B) TYPE: amino acid 


(D) TOPOLOGY: unknown 


(ii) MOLECULE TYPE: protein 





(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 


Leu Ala Ser Val Glu Ala Phe Asn Ser Thr Glu Thr Leu Asp Pro 
5 10 15 


Ile 


Lys 


(2) 


Leu 


(2) 


Ala 


Lys 


Ala 


Ala 


Ser 
65 


(2) 


Leu 


Trp 


(2)
Tyr Lys Glu Trp Pro Asn Ile Gly Gly Ser Trp Leu Gly Gly Leu
20 25 30 


Asp Ile Leu Pro 
35 
(2) INFORMATION FOR SEQ ID NO: 21:
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 16 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: unknown 
(ii) MOLECULE TYPE: protein 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 
Gly Thr Val Asp Glu Asp Tyr Lys Arg Cys Thr Gly Gly Tyr Asp 
5 10 15
(2) INFORMATION FOR SEQ ID NO: 22:
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 78 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: unknown 
(ii) MOLECULE TYPE: protein 


(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 


Asn Ala Phe Asn Gln Ala Ile Gly Asn Ile Thr Gln Ala Phe Gly 
5 10 15 


Val Asn Asp Ala Ile His Gln Thr Ser Gln Gly Leu Ala Thr Val 
20 25 30 


Lys Ala Leu Ala Lys Val Gln Asp Val Val Asn Thr Gln Gly Gln 
35 40 45 


Leu Ser His Leu Thr Val Gln Leu Gln Asn Asn Phe Gln Ala Ile 
50 55 60 


Ser Ser Ile Ser Asp Ile Tyr Asn Arg Leu Asp Glu Leu 
70 75 
(2) INFORMATION FOR SEQ ID NO: 23:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: protein 


(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 


Ala Ile Leu Ile Asp Asn Ile Asn Asn Thr Leu Val Asn Leu Glu 
5 10 15 


Leu Asn Arg Ile Glu Thr Tyr Val Lys 
20 25 
(2) INFORMATION FOR SEQ ID NO: 24:
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 372 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: unknown 


(ii) MOLECULE TYPE: DNA (genomic) 


(ix) FEATURE: 


CAA 
Gln 


CAA 
Gln 


TTG 
Leu 


GCA 
Ala 


AGG 
Arg 
65 


TCT 


Ser 


TCA 
Ser 


CTA 
Leu 


(2) 


Gln 


Gln 


Leu 


Ala 


Arg 


65 


ser 


ser 


Leu 


(2)
(A) NAME/KEY: CDS
(B) LOCATION: 1..372 


(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 


GGG CAA GCT TTA AGC CAC CTA ACA GTA CAA TTG 
Gly Gln Ala Leu Ser His Leu Thr Val Gln Leu 
5 10 


GCC ATT AGT AGT TCC ATT AGT GAC ATT TAT AAC 
Ala Ile Ser Ser Ser Ile Ser Asp Ile Tyr Asn 
20 25 


AGT GCT GAT GCA CAA GTT GAC AGG CTG ATT ACA 
Ser Ala Asp Ala Gln Val Asp Arg Leu Ile Thr 
35 40 


CTT AAT GCA TTT GTG TCT CAG ACT TTA ACC AGA 
Leu Asn Ala Phe Val Ser Gln Thr Leu Thr Arg 
50 55 60 


GCT AGC AGA CAG CTT GCT AAA GAC AAG GTA AAT 
Ala Ser Arg Gln Leu Ala Lys Asp Lys Val Asn 
70 75 


CAA TCT CAG AGA TTT GGA TTC TGT GGT AAT GGT 
Gln Ser Gln Arg Phe Gly Phe Cys Gly Asn Gly 
85 90 


CTT GCA AAT GCA GCA CCA AAT GGC ATG ATC TTC 
Leu Ala Asn Ala Ala Pro Asn Gly Met Ile Phe 
100 105 


TTA CCA ACA GCT TAT GAA ACC GTG ACG GCC TGG 
Leu Pro Thr Ala Tyr Glu Thr Val Thr Ala Trp 
115 120 
INFORMATION FOR SEQ ID NO: 25: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 124 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 


Gly Gln Ala Leu Ser His Leu Thr Val Gln Leu 
5 10 


Ala Ile Ser Ser Ser Ile Ser Asp Ile Tyr Asn 
20 25 


Ser Ala Asp Ala Gln Val Asp Arg Leu Ile Thr 
35 40 


Leu Asn Ala Phe Val Ser Gln Thr Leu Thr Arg 
50 55 60 


Ala Ser Arg Gln Leu Ala Lys Asp Lys Val Asn 
70 75 


Gln Ser Gln Arg Phe Gly Phe Cys Gly Asn Gly 
85 90 


Leu Ala Asn Ala Ala Pro Asn Gly Met Ile Phe 
100 105 


Leu Pro Thr Ala Tyr Glu Thr Val Thr Ala Trp 
115 120 
INFORMATION FOR SEQ ID NO: 26: 
(i) SEQUENCE CHARACTERISTICS: 


(A) LENGTH: 180 base pairs 
(B) TYPE: nucleic acid 


CAA 
Gln 


AGG 
Arg 


GGA 
Gly 
45 


CAA 
Gln 


GAA 
Glu 


ACA 
Thr 


bal 
Phe 


Gln 


Arg 


Gly 


45 


Gln 


Glu 


Thr 


Phe 


AAT 
Asn 


cTT 
Leu 

30 
AGA 
Arg 


GCA 
Ala 


TGC 
Cys 


CAT 
His 


CAC 
His 
110 


Asn 


Leu 


30 


Arg 


Ala 


Cys 


His 


His 
110 


AAT 
Asn 

15 
GAT 
Asp 


cTT 
Leu 


GAG 
Glu 


GTT 
Val 


TTA 
Leu 
95 


ACA 
Thr 


Asn 


15 


Asp 


Leu 


Glu 


Val 


Leu 


95 


Thr 


TTC 
Phe 


Glu 


ACA 
Thr 


GTT 
Val 


AGG 
Arg 
80 


TTT. 
Phe 


GTG 
val 


Phe 


Glu 


Thr 


Val 


Arg 


80 


Phe 


Val 


48 


96 


144 


192 


240 


288 


336 


372 


48 


cTT 
Leu 


AGT 
Ser 


ATT 
Ile 


TcT 
Ser 


(2) 


Leu 


Ser 


Ile 


Ser 


(2) 


GTc 
val 


GGT 
Gly 


ATT 
Ile 
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(C) STRANDEDNESS: double 
(D) TOPOLOGY: 


unknown 


(ii) MOLECULE TYPE: DNA (genomic) 


(ix) FEATURE: 
(A) NAME/KEY: CDS 
(B) LOCATION: 


1..180 


(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 


GGT 
Gly 


AAC 
Asn 


CGT 
Arg 


TTA 
Leu 
50 


ATG 
Met 


ATC 
Ile 


TcT 
Ser 
35 


TGG 
Trp 


AAG CGT 
Lys Arg 
5 


ACA CTA 
Thr Leu 
20 


AAC CAA 
Asn Gln 


GAC GAT 
Asp Asp 


AGT 
Ser 


CCA 
Pro 


TTT 
Phe 


GTG 
val 


GGT TAT GGT CAA CCC ATA GCC TCA ACA TTA 
Gly Tyr Gly Gln Pro Ile Ala Ser Thr Leu 
10 15 


ATG CAG GAT AAT AAC ACC GAT GTG TAC TGC 
Met Gln Asp Asn Asn Thr Asp Val Tyr Cys 
25 30 


TCA GTT TAC GTT CAT TCC ACT TGT AAA AGT 
Ser Val Tyr Val His Ser Thr Cys Lys Ser 
40 45 


TTT AAT TCC GAC TGC ACA 
Phe Asn Ser Asp Cys Thr 
55 60 


INFORMATION FOR SEQ ID NO: 27: 


(i) SEQUENCE CHARACTERISTICS: 


(A) LENGTH: 


(B) TYPE: 
(D) TOPOLOGY: 


60 amino acids 


amino acid 


linear 


(ii) MOLECULE TYPE: protein 


(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 


Gly Met Lys Arg Ser Gly Tyr Gly Gln Pro Ile Ala Ser Thr Leu 


Asn 


Arg 


Leu 
50 


INFORMATION 


Ile 


Ser 
35 


Trp 


5 


10 15 


Thr Leu Pro Met Gln Asp Asn Asn Thr Asp Val Tyr Cys 


20 


25 30 


Asn Gln Phe Ser Val Tyr Val His Ser Thr Cys Lys Ser 


40 45 


Asp Asp Val Phe Asn Ser Asp Cys Thr 


55 60 


FOR SEQ ID NO: 28: 


(i) SEQUENCE CHARACTERISTICS: 


(A) LENGTH: 


(B) TYPE: 


141 base pairs 


nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: 


unknown 


(ii) MOLECULE TYPE: DNA (genomic) 


(ix) FEATURE: 
(A) NAME/KEY: CDS 
(B) LOCATION: 


1..141 


(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 


ATT 
Ile 


GcT 
Ala 


TcT 
Ser 


AGA 
Arg 


ACA 
Thr 


TGT 
Cys 
35 


TTC AAC 
Phe Asn 
5 


GTA TIT 
Val Phe 
20 


TAT AAT 
Tyr Asn 


cTT 
Leu 


TCA 
Ser 


GAT 
Asp 


AAT TTT ACC ACA GAT GTA CAA TCT GGT ATG 
Asn Phe Thr Thr Asp Val Gln Ser Gly Met 
10 15 


CTG AAT ACA ACA GGT GGT GTC ATT CTT GAG 
Leu Asn Thr Thr Gly Gly Val Ile Leu Glu 
25 30 


ACA GTG AGT GAG TCA AGT TTC TAC AGT 
Thr Val Ser Glu Ser Ser Phe Tyr Ser 
40 45 


48 


96 


144 


180 


48 


96 


141 


50 
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(2) INFORMATION FOR SEQ ID NO: 29: 


(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 47 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: linear 


(ii) MOLECULE TYPE: protein 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 


Val Ile Arg Phe Asn Leu Asn Phe Thr Thr Asp Val Gln Ser Gly Met 
1 5 10 15 


Gly Ala Thr Val Phe Ser Leu Asn Thr Thr Gly Gly Val Ile Leu Glu 
20 25 30 


Ile Ser Cys Tyr Asn Asp Thr Val Ser Glu Ser Ser Phe Tyr Ser 
35 40 45 


(2) INFORMATION FOR SEQ ID NO: 30: 


(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 51 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: unknown 


(ii) MOLECULE TYPE: DNA (genomic) 
(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..51 


(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 


TGT ATA ACT AAA AAT AAA ATC ATT GAC TAT AAC ACG TTT ACC AGC GCA 48 
Cys Ile Thr Lys Asn Lys Ile Ile Asp Tyr Asn Thr Phe Thr Ser Ala 

1 5 10 15 
CAG 51 
Gln 


(2) INFORMATION FOR SEQ ID NO: 31: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 17 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 


Cys Ile Thr Lys Asn Lys Ile Ile Asp Tyr Asn Thr Phe Thr Ser Ala 
1 5 10 15 


Gln 
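
As an illustrative cross-check that is not part of the original listing, the short Python sketch below translates the 51 base pair coding sequence printed for SEQ ID NO: 30 with the standard genetic code and compares the result with the 17 residues printed for SEQ ID NO: 31. The function name `translate` and both lookup tables are assumptions introduced here for illustration only.

```python
from itertools import product

# Standard genetic code with codons ordered T, C, A, G at each position.
# This table is an assumption of the sketch, not something stated in the listing.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# One-letter to three-letter residue names, matching the style of the listing.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def translate(dna):
    """Translate a DNA string (whitespace ignored) into three-letter residues."""
    seq = "".join(dna.split()).upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return [THREE_LETTER[CODON_TABLE[seq[i:i + 3]]] for i in range(0, len(seq), 3)]


# Sequences copied from the listing above.
SEQ_ID_30 = "TGT ATA ACT AAA AAT AAA ATC ATT GAC TAT AAC ACG TTT ACC AGC GCA CAG"
SEQ_ID_31 = "Cys Ile Thr Lys Asn Lys Ile Ile Asp Tyr Asn Thr Phe Thr Ser Ala Gln".split()

residues = translate(SEQ_ID_30)
print(len(residues), residues == SEQ_ID_31)  # prints "17 True"
```

The same function could be applied to any other CDS entry in the listing, provided the codons are first joined into one string.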


(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 42 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 
(A) NAME/KEY: CDS 
(B) LOCATION: 1..42 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 

TCT TGT TAT AAT GAT ACA GTG AGT GAG TCA AGT TTC TAC AGT 42 
Ser Cys Tyr Asn Asp Thr Val Ser Glu Ser Ser Phe Tyr Ser 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 14 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 

Ser Cys Tyr Asn Asp Thr Val Ser Glu Ser Ser Phe Tyr Ser 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 51 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 
(A) NAME/KEY: CDS 
(B) LOCATION: 1..51 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 

ATT GGG TGT TTA GGA AGC TGT TGT CAT TCC ATA TGT AGT AGA AGG CGA 48 
Ile Gly Cys Leu Gly Ser Cys Cys His Ser Ile Cys Ser Arg Arg Arg 
1 5 10 15 

TTT 51 
Phe 

(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 17 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 

Ile Gly Cys Leu Gly Ser Cys Cys His Ser Ile Cys Ser Arg Arg Arg 
1 5 10 15 

Phe 

(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 42 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 
(A) NAME/KEY: CDS 
(B) LOCATION: 1..42 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 

TGC ATA CCC ATA TTG CTA TTT TGT TGT TGT AGC ACT GGT TGT 42 
Cys Ile Pro Ile Leu Leu Phe Cys Cys Cys Ser Thr Gly Cys 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 14 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37: 

Cys Ile Pro Ile Leu Leu Phe Cys Cys Cys Ser Thr Gly Cys 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 195 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 
(A) NAME/KEY: CDS 
(B) LOCATION: 1..195 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38: 

??? TTA AAC CTG ACT GGT GAA ATT AAT GAC TTA GAA TTT AGG TCA GAA 48 
Tyr Leu Asn Leu Thr Gly Glu Ile Asn Asp Leu Glu Phe Arg Ser Glu 
1 5 10 15 

??? TTA CAT AAC ACC ACA GTA GAA CTT GCT ATT CTC ATT GAT AAT ATT 96 
Lys Leu His Asn Thr Thr Val Glu Leu Ala Ile Leu Ile Asp Asn Ile 
20 25 30 

??? AAC ACA TTA GTC AAT CTT GAA TGG CTC AAT AGA ATT GAA ACT TAT 144 
Asn Asn Thr Leu Val Asn Leu Glu Trp Leu Asn Arg Ile Glu Thr Tyr 
35 40 45 

GTA AAA TGG CCT TGG TAT GTG TGG CTA CTA ATT GGA TTA GTA GTA ATA 192 
Val Lys Trp Pro Trp Tyr Val Trp Leu Leu Ile Gly Leu Val Val Ile 
50 55 60 

TTC 195 
Phe 
65 

(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 65 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 

Tyr Leu Asn Leu Thr Gly Glu Ile Asn Asp Leu Glu Phe Arg Ser Glu 
1 5 10 15 

Lys Leu His Asn Thr Thr Val Glu Leu Ala Ile Leu Ile Asp Asn Ile 
20 25 30 

Asn Asn Thr Leu Val Asn Leu Glu Trp Leu Asn Arg Ile Glu Thr Tyr 
35 40 45 

Val Lys Trp Pro Trp Tyr Val Trp Leu Leu Ile Gly Leu Val Val Ile 
50 55 60 

Phe 
65 


(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 765 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 
(A) NAME/KEY: CDS 
(B) LOCATION: 1..765 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40: 

GAT GGA CCG CGT TAC TGT TAC GCA CTC TAT AAT GGC ACG GCT CTT AAG 48 
Asp Gly Pro Arg Tyr Cys Tyr Ala Leu Tyr Asn Gly Thr Ala Leu Lys 
1 5 10 15 

TAT TTA GGA ACA TTA CCA CCT AGT GTC AAG GAA ATT GCT ATT AGT AAG 96 
Tyr Leu Gly Thr Leu Pro Pro Ser Val Lys Glu Ile Ala Ile Ser Lys 
20 25 30 

TGG GGC CAT TTT TAT ATT AAT GGT TAC AAT TTC TTT AGC ACT TTT CCT 144 
Trp Gly His Phe Tyr Ile Asn Gly Tyr Asn Phe Phe Ser Thr Phe Pro 
35 40 45 

ATT GAT TGT ATA TCT TTT AAT TTA ACC ACT GGT GAT AGT GGA GCA TTT 192 
Ile Asp Cys Ile Ser Phe Asn Leu Thr Thr Gly Asp Ser Gly Ala Phe 
50 55 60 

TGG ACA ATT GCT TAC ACA TCG TAC ACT GAC GCA TTA GTA CAA GTT GAA 240 
Trp Thr Ile Ala Tyr Thr Ser Tyr Thr Asp Ala Leu Val Gln Val Glu 
65 70 75 80 

AAC ACA GCT ATT AAA AAG GTG ACG TAT TGT AAC AGT CAC ATT AAT AAC 288 
Asn Thr Ala Ile Lys Lys Val Thr Tyr Cys Asn Ser His Ile Asn Asn 
85 90 95 

ATT AAA TGT TCT CAA CTT ACT GCT AAT TTG CAA AAT GGA TTT TAT CCT 336 
Ile Lys Cys Ser Gln Leu Thr Ala Asn Leu Gln Asn Gly Phe Tyr Pro 
100 105 110 

GTT GCT TCA AGT GAA GTT GGT CTT GTC AAT AAG AGT GTT GTG TTA CTA 384 
Val Ala Ser Ser Glu Val Gly Leu Val Asn Lys Ser Val Val Leu Leu 
115 120 125 

CCT AGT TTC TAT TCA CAT ACC AGT GTT AAT ATA ACT ATT GAT CTT GGT 432 
Pro Ser Phe Tyr Ser His Thr Ser Val Asn Ile Thr Ile Asp Leu Gly 
130 135 140 

ATG AAG CGT AGT GGT TAT GGT CAA CCC ATA GCC TCA ACA TTA AGT AAC 480 
Met Lys Arg Ser Gly Tyr Gly Gln Pro Ile Ala Ser Thr Leu Ser Asn 
145 150 155 160 

ATC ACA CTA CCA ATG CAG GAT AAT AAC ACC GAT GTG TAC TGC ATT CGT 528 
Ile Thr Leu Pro Met Gln Asp Asn Asn Thr Asp Val Tyr Cys Ile Arg 
165 170 175 

TCT AAC CAA TTT TCA GTT TAC GTT CAT TCC ACT TGT AAA AGT TCT TTA 576 
Ser Asn Gln Phe Ser Val Tyr Val His Ser Thr Cys Lys Ser Ser Leu 
180 185 190 

TGG GAC GAT GTG TTT AAT TCC GAC TGC ACA GAT GTT TTA TAT GCT ACA 624 
Trp Asp Asp Val Phe Asn Ser Asp Cys Thr Asp Val Leu Tyr Ala Thr 
195 200 205 

GCT GTT ATA AAA ACT GGT ACT TGT CCT TTC TCG TTT GAT AAA TTG AAC 672 
Ala Val Ile Lys Thr Gly Thr Cys Pro Phe Ser Phe Asp Lys Leu Asn 
210 215 220 

AAT TAC TTA ACT TTT AAC AAG TTC TGT TTG TCA TTG AAT CCT GTT GGT 720 
Asn Tyr Leu Thr Phe Asn Lys Phe Cys Leu Ser Leu Asn Pro Val Gly 
225 230 235 240 

GCC AAC TGC AAG TTT GAT GTT GCC GCT CGT ACA AGA ACC AAT GAG 765 
Ala Asn Cys Lys Phe Asp Val Ala Ala Arg Thr Arg Thr Asn Glu 
245 250 255 

(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 255 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41: 

Asp Gly Pro Arg Tyr Cys Tyr Ala Leu Tyr Asn Gly Thr Ala Leu Lys 
1 5 10 15 

Tyr Leu Gly Thr Leu Pro Pro Ser Val Lys Glu Ile Ala Ile Ser Lys 
20 25 30 

Trp Gly His Phe Tyr Ile Asn Gly Tyr Asn Phe Phe Ser Thr Phe Pro 
35 40 45 

Ile Asp Cys Ile Ser Phe Asn Leu Thr Thr Gly Asp Ser Gly Ala Phe 
50 55 60 

Trp Thr Ile Ala Tyr Thr Ser Tyr Thr Asp Ala Leu Val Gln Val Glu 
65 70 75 80 

Asn Thr Ala Ile Lys Lys Val Thr Tyr Cys Asn Ser His Ile Asn Asn 
85 90 95 

Ile Lys Cys Ser Gln Leu Thr Ala Asn Leu Gln Asn Gly Phe Tyr Pro 
100 105 110 

Val Ala Ser Ser Glu Val Gly Leu Val Asn Lys Ser Val Val Leu Leu 
115 120 125 

Pro Ser Phe Tyr Ser His Thr Ser Val Asn Ile Thr Ile Asp Leu Gly 
130 135 140 

Met Lys Arg Ser Gly Tyr Gly Gln Pro Ile Ala Ser Thr Leu Ser Asn 
145 150 155 160 

Ile Thr Leu Pro Met Gln Asp Asn Asn Thr Asp Val Tyr Cys Ile Arg 
165 170 175 

Ser Asn Gln Phe Ser Val Tyr Val His Ser Thr Cys Lys Ser Ser Leu 
180 185 190 

Trp Asp Asp Val Phe Asn Ser Asp Cys Thr Asp Val Leu Tyr Ala Thr 
195 200 205 

Ala Val Ile Lys Thr Gly Thr Cys Pro Phe Ser Phe Asp Lys Leu Asn 
210 215 220 

Asn Tyr Leu Thr Phe Asn Lys Phe Cys Leu Ser Leu Asn Pro Val Gly 
225 230 235 240 

Ala Asn Cys Lys Phe Asp Val Ala Ala Arg Thr Arg Thr Asn Glu 
245 250 255 

(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1284 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 
(A) NAME/KEY: CDS 
(B) LOCATION: 1..1284 


(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42: 

AGG CCT CTT TTA AAA CAT GGT TTG TTG TGT ATA ACT AAA AAT AAA ATC 48 
Arg Pro Leu Leu Lys His Gly Leu Leu Cys Ile Thr Lys Asn Lys Ile 
1 5 10 15 

ATT GAC TAT AAC ACG TTT ACC AGC GCA CAG TGG AGT GCC ATA TGT TTG 96 
Ile Asp Tyr Asn Thr Phe Thr Ser Ala Gln Trp Ser Ala Ile Cys Leu 
20 25 30 

GGT GAT GAC AGA ??? ATA CCA TTC TCT GTC ATA CCC ACA ??? AAT ??? 144 
Gly Asp Asp Arg Lys Ile Pro Phe Ser Val Ile Pro Thr Gly Asn Gly 
35 40 45 

ACA ??? ATA TTT GGT CTT GAG TGG AAT GAT GAC TAT GTT ACA ??? TAT 192 
Thr Lys Ile Phe Gly Leu Glu Trp Asn Asp Asp Tyr Val Thr Ala Tyr 
50 55 60 

ATT AGT GAT CGT TCT CAC CAT TTG AAC ATC AAT AAT AAT TGG TTT AAC 240 
Ile Ser Asp Arg Ser His His Leu Asn Ile Asn Asn Asn Trp Phe Asn 
65 70 75 80 

??? GTG ACA ATC CTA TAC TCT CGA TCA AGC ACT GCT ACG TGG CAG AAG 288 
Asn Val Thr Ile Leu Tyr Ser Arg Ser Ser Thr Ala Thr Trp Gln Lys 
85 90 95 

AGT GCT GCA TAT GTT TAT CAA GGT GTT TCA AAT TTT ACT TAT TAC AAG 336 
Ser Ala Ala Tyr Val Tyr Gln Gly Val Ser Asn Phe Thr Tyr Tyr Lys 
100 105 110 

TTA AAT AAC ACC AAT GGC TTG ??? AGC TAT GAA TTG TGT GAA GAT TAT 384 
Leu Asn Asn Thr Asn Gly Leu Lys Ser Tyr Glu Leu Cys Glu Asp Tyr 
115 120 125 

GAA TGC TGC ACT GGC TAT GCT ACC ??? GTA TTT GCC CCG ACA GTG GGC 432 
Glu Cys Cys Thr Gly Tyr Ala Thr Asn Val Phe Ala Pro Thr Val Gly 
130 135 140 

GGT TAT ATA CCT GAT GGC TTC AGT TTT AAC AAT TGG TTT ATG CTT ACA 480 
Gly Tyr Ile Pro Asp Gly Phe Ser Phe Asn Asn Trp Phe Met Leu Thr 
145 150 155 160 

??? AGT TCC ACG TTT GTT AGT GGC AGA ??? GTA ACA AAT CAA CCA TTA 528 
Asn Ser Ser Thr Phe Val Ser Gly Arg Phe Val Thr Asn Gln Pro Leu 
165 170 175 

TTG GTT AAT TGT TTG TGG CCA GTG CCC AGT CTT GGT GTC GCA GCA CAA 576 
Leu Val Asn Cys Leu Trp Pro Val Pro Ser Leu Gly Val Ala Ala Gln 
180 185 190 

GAA TTT TGT TTT GAA GGT GCG CAG TTT AGC CAA TGT AAT GGT GTG TCT 624 
Glu Phe Cys Phe Glu Gly Ala Gln Phe Ser Gln Cys Asn Gly Val Ser 
195 200 205 

TTA AAC AAT ACA GTG GAT GTC ATT AGA TTC AAC CTT AAT TTT ACC ACA 672 
Leu Asn Asn Thr Val Asp Val Ile Arg Phe Asn Leu Asn Phe Thr Thr 
210 215 220 

GAT GTA CAA TCT GGT ATG GGT GCT ACA GTA TTT TCA CTG AAT ACA ACA 720 
Asp Val Gln Ser Gly Met Gly Ala Thr Val Phe Ser Leu Asn Thr Thr 
225 230 235 240 

GGT GGT GTC ATT CTT GAG ATT TCT TGT TAT AAT GAT ACA GTG AGT GAG 768 
Gly Gly Val Ile Leu Glu Ile Ser Cys Tyr Asn Asp Thr Val Ser Glu 
245 250 255 

TCA AGT TTC TAC AGT TAT GGT GAA ATT TCA TTC GGC GTA ACT GAT GGA 816 
Ser Ser Phe Tyr Ser Tyr Gly Glu Ile Ser Phe Gly Val Thr Asp Gly 
260 265 270 

CCG CGT TAC TGT TAC GCA CTC TAT AAT GGC ACG GCT CTT AAG TAT TTA 864 
Pro Arg Tyr Cys Tyr Ala Leu Tyr Asn Gly Thr Ala Leu Lys Tyr Leu 
275 280 285 

GGA ACA TTA CCA CCT AGT GTC AAG GAA ATT GCT ATT AGT AAG TGG GGC 912 
Gly Thr Leu Pro Pro Ser Val Lys Glu Ile Ala Ile Ser Lys Trp Gly 
290 295 300 

CAT TTT TAT ATT AAT GGT TAC AAT TTC TTT AGC ACT TTT CCT ATT GAT 960 
His Phe Tyr Ile Asn Gly Tyr Asn Phe Phe Ser Thr Phe Pro Ile Asp 
305 310 315 320 

TGT ATA TCT TTT AAT TTA ACC ACT GGT GAT AGT GGA GCA TTT TGG ACA 1008 
Cys Ile Ser Phe Asn Leu Thr Thr Gly Asp Ser Gly Ala Phe Trp Thr 
325 330 335 

ATT GCT TAC ACA TCG TAC ACT GAC GCA TTA GTA CAA GTT GAA AAC ACA 1056 
Ile Ala Tyr Thr Ser Tyr Thr Asp Ala Leu Val Gln Val Glu Asn Thr 
340 345 350 

GCT ATT AAA AAG GTG ACG TAT TGT AAC AGT CAC ATT AAT AAC ATT AAA 1104 
Ala Ile Lys Lys Val Thr Tyr Cys Asn Ser His Ile Asn Asn Ile Lys 
355 360 365 

TGT TCT CAA CTT ACT GCT AAT TTG CAA AAT GGA TTT TAT CCT GTT GCT 1152 
Cys Ser Gln Leu Thr Ala Asn Leu Gln Asn Gly Phe Tyr Pro Val Ala 
370 375 380 

TCA AGT GAA GTT GGT CTT GTC AAT AAG AGT GTT GTG TTA CTA CCT AGT 1200 
Ser Ser Glu Val Gly Leu Val Asn Lys Ser Val Val Leu Leu Pro Ser 
385 390 395 400 

TTC TAT TCA CAT ACC AGT GTT AAT ATA ACT ATT GAT CTT GGT ATG AAG 1248 
Phe Tyr Ser His Thr Ser Val Asn Ile Thr Ile Asp Leu Gly Met Lys 
405 410 415 

CGT AGT GGT TAT GGT CAA CCC ATA GCC TCA ACA TTA 1284 
Arg Ser Gly Tyr Gly Gln Pro Ile Ala Ser Thr Leu 
420 425 

(2) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 428 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43: 

Arg Pro Leu Leu Lys His Gly Leu Leu Cys Ile Thr Lys Asn Lys Ile 
1 5 10 15 

Ile Asp Tyr Asn Thr Phe Thr Ser Ala Gln Trp Ser Ala Ile Cys Leu 
20 25 30 

Gly Asp Asp Arg Lys Ile Pro Phe Ser Val Ile Pro Thr Gly Asn Gly 
35 40 45 

Thr Lys Ile Phe Gly Leu Glu Trp Asn Asp Asp Tyr Val Thr Ala Tyr 
50 55 60 

Ile Ser Asp Arg Ser His His Leu Asn Ile Asn Asn Asn Trp Phe Asn 
65 70 75 80 

Asn Val Thr Ile Leu Tyr Ser Arg Ser Ser Thr Ala Thr Trp Gln Lys 
85 90 95 

Ser Ala Ala Tyr Val Tyr Gln Gly Val Ser Asn Phe Thr Tyr Tyr Lys 
100 105 110 

Leu Asn Asn Thr Asn Gly Leu Lys Ser Tyr Glu Leu Cys Glu Asp Tyr 
115 120 125 

Glu Cys Cys Thr Gly Tyr Ala Thr Asn Val Phe Ala Pro Thr Val Gly 
130 135 140 

Gly Tyr Ile Pro Asp Gly Phe Ser Phe Asn Asn Trp Phe Met Leu Thr 
145 150 155 160 

Asn Ser Ser Thr Phe Val Ser Gly Arg Phe Val Thr Asn Gln Pro Leu 
165 170 175 

Leu Val Asn Cys Leu Trp Pro Val Pro Ser Leu Gly Val Ala Ala Gln 
180 185 190 

Glu Phe Cys Phe Glu Gly Ala Gln Phe Ser Gln Cys Asn Gly Val Ser 
195 200 205 

Leu Asn Asn Thr Val Asp Val Ile Arg Phe Asn Leu Asn Phe Thr Thr 
210 215 220 

Asp Val Gln Ser Gly Met Gly Ala Thr Val Phe Ser Leu Asn Thr Thr 
225 230 235 240 

Gly Gly Val Ile Leu Glu Ile Ser Cys Tyr Asn Asp Thr Val Ser Glu 
245 250 255 

Ser Ser Phe Tyr Ser Tyr Gly Glu Ile Ser Phe Gly Val Thr Asp Gly 
260 265 270 

Pro Arg Tyr Cys Tyr Ala Leu Tyr Asn Gly Thr Ala Leu Lys Tyr Leu 
275 280 285 

Gly Thr Leu Pro Pro Ser Val Lys Glu Ile Ala Ile Ser Lys Trp Gly 
290 295 300 

His Phe Tyr Ile Asn Gly Tyr Asn Phe Phe Ser Thr Phe Pro Ile Asp 
305 310 315 320 

Cys Ile Ser Phe Asn Leu Thr Thr Gly Asp Ser Gly Ala Phe Trp Thr 
325 330 335 

Ile Ala Tyr Thr Ser Tyr Thr Asp Ala Leu Val Gln Val Glu Asn Thr 
340 345 350 

Ala Ile Lys Lys Val Thr Tyr Cys Asn Ser His Ile Asn Asn Ile Lys 
355 360 365 

Cys Ser Gln Leu Thr Ala Asn Leu Gln Asn Gly Phe Tyr Pro Val Ala 
370 375 380 

Ser Ser Glu Val Gly Leu Val Asn Lys Ser Val Val Leu Leu Pro Ser 
385 390 395 400 

Phe Tyr Ser His Thr Ser Val Asn Ile Thr Ile Asp Leu Gly Met Lys 
405 410 415 

Arg Ser Gly Tyr Gly Gln Pro Ile Ala Ser Thr Leu 
420 425 

(2) INFORMATION FOR SEQ ID NO: 44: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 546 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 
(A) NAME/KEY: CDS 
(B) LOCATION: 1..546 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44: 

GAT TGT ATA TCT TTT AAT TTA ACC ACT GGT GAT AGT GGA GCA TTT TGG 48 
Asp Cys Ile Ser Phe Asn Leu Thr Thr Gly Asp Ser Gly Ala Phe Trp 
1 5 10 15 

ACA ATT GCT TAC ACA TCG TAC ACT GAC GCA TTA GTA CAA GTT GAA AAC 96 
Thr Ile Ala Tyr Thr Ser Tyr Thr Asp Ala Leu Val Gln Val Glu Asn 
20 25 30 

ACA GCT ATT AAA AAG GTG ACG TAT TGT AAC AGT CAC ATT AAT AAC ATT 144 
Thr Ala Ile Lys Lys Val Thr Tyr Cys Asn Ser His Ile Asn Asn Ile 
35 40 45 

AAA TGT TCT CAA CTT ACT GCT AAT TTG CAA AAT GGA TTT TAT CCT GTT 192 
Lys Cys Ser Gln Leu Thr Ala Asn Leu Gln Asn Gly Phe Tyr Pro Val 
50 55 60 

GCT TCA AGT GAA GTT GGT CTT GTC AAT AAG AGT GTT GTG TTA CTA CCT 240 
Ala Ser Ser Glu Val Gly Leu Val Asn Lys Ser Val Val Leu Leu Pro 
65 70 75 80 

AGT TTC TAT TCA CAT ACC AGT GTT AAT ATA ACT ATT GAT CTT GGT ATG 288 
Ser Phe Tyr Ser His Thr Ser Val Asn Ile Thr Ile Asp Leu Gly Met 
85 90 95 

AAG CGT AGT GGT TAT GGT CAA CCC ATA GCC TCA ACA TTA AGT AAC ATC 336 
Lys Arg Ser Gly Tyr Gly Gln Pro Ile Ala Ser Thr Leu Ser Asn Ile 
100 105 110 

ACA CTA CCA ATG CAG GAT AAT AAC ACC GAT GTG TAC TGC ATT CGT TCT 384 
Thr Leu Pro Met Gln Asp Asn Asn Thr Asp Val Tyr Cys Ile Arg Ser 
115 120 125 

AAC CAA TTT TCA GTT TAC GTT CAT TCC ACT TGT AAA AGT TCT TTA TGG 432 
Asn Gln Phe Ser Val Tyr Val His Ser Thr Cys Lys Ser Ser Leu Trp 
130 135 140 

GAC GAT GTG TTT AAT TCC GAC TGC ACA GAT GTT TTA TAT GCT ACA GCT 480 
Asp Asp Val Phe Asn Ser Asp Cys Thr Asp Val Leu Tyr Ala Thr Ala 
145 150 155 160 

GTT ATA AAA ACT GGT ACT TGT CCT TTC TCG TTT GAT AAA TTG AAC AAT 528 
Val Ile Lys Thr Gly Thr Cys Pro Phe Ser Phe Asp Lys Leu Asn Asn 
165 170 175 

TAC TTA ACT TTT AAC AAG 546 
Tyr Leu Thr Phe Asn Lys 
180 
(2) INFORMATION FOR SEQ ID NO: 45: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 182 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45: 


Asp Cys Ile Ser Phe Asn Leu Thr Thr Gly Asp Ser Gly Ala Phe Trp 
1 5 10 15 


Thr Ile Ala Tyr Thr Ser Tyr Thr Asp Ala Leu Val Gln Val Glu Asn 
20 25 30 


Thr Ala Ile Lys Lys Val Thr Tyr Cys Asn Ser His Ile Asn Asn Ile 
35 40 45 


Lys Cys Ser Gln Leu Thr Ala Asn Leu Gln Asn Gly Phe Tyr Pro Val 
50 55 60 


Ala Ser Ser Glu Val Gly Leu Val Asn Lys Ser Val Val Leu Leu Pro 
65 70 75 80 


Ser Phe Tyr Ser His Thr Ser Val Asn Ile Thr Ile Asp Leu Gly Met 
85 90 95 


Lys Arg Ser Gly Tyr Gly Gln Pro Ile Ala Ser Thr Leu Ser Asn Ile 
100 105 110 


Thr Leu Pro Met Gln Asp Asn Asn Thr Asp Val Tyr Cys Ile Arg Ser 
115 120 125 


Asn Gln Phe Ser Val Tyr Val His Ser Thr Cys Lys Ser Ser Leu Trp 
130 135 140 


Asp Asp Val Phe Asn Ser Asp Cys Thr Asp Val Leu Tyr Ala Thr Ala 
145 150 155 160 


Val Ile Lys Thr Gly Thr Cys Pro Phe Ser Phe Asp Lys Leu Asn Asn 
165 170 175 


Tyr Leu Thr Phe Asn Lys 
180 
(2) INFORMATION FOR SEQ ID NO: 46: 


(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 38 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: unknown 


(ii) MOLECULE TYPE: DNA (genomic) 





(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46: 


TAAATAGGCC TTTAGTGGAC ATGCACTTTT TCAATTGG 38 


(2) INFORMATION FOR SEQ ID NO: 47: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 39 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: unknown 
(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47: 


TTAGTAGGCC TGTCGAGGCT ATGGGTTGAC CATAACCAC 39 


(2) INFORMATION FOR SEQ ID NO: 48: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 37 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: unknown 
(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48: 


CAGATCCCGG GTGTACAATC TGGTATGGGT GCTACAG 37 


(2) INFORMATION FOR SEQ ID NO: 49: 


(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 39 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: unknown 


(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49: 


GTGCCCCCGG GTATGATTGT GCTCGTAACT TGCCTCTTG 39 


(2) INFORMATION FOR SEQ ID NO: 50: 


(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 43 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: unknown 


(ii) MOLECULE TYPE: DNA (genomic) 





(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50: 


AGCACCCATA CCAGATTGTA CATCTGCAGT GAAATTAAGA TTG 43 
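
As a hedged consistency check that is not part of the original listing, the sketch below counts the bases of the single-stranded oligonucleotides printed for SEQ ID NOS: 46 to 50 and compares each count with the LENGTH value declared in its header. The dictionary `PRIMERS` and the helper `check_length` are names assumed here for illustration.

```python
# Sequences and declared lengths copied from SEQ ID NOS: 46 to 50 above.
PRIMERS = {
    46: ("TAAATAGGCC TTTAGTGGAC ATGCACTTTT TCAATTGG", 38),
    47: ("TTAGTAGGCC TGTCGAGGCT ATGGGTTGAC CATAACCAC", 39),
    48: ("CAGATCCCGG GTGTACAATC TGGTATGGGT GCTACAG", 37),
    49: ("GTGCCCCCGG GTATGATTGT GCTCGTAACT TGCCTCTTG", 39),
    50: ("AGCACCCATA CCAGATTGTA CATCTGCAGT GAAATTAAGA TTG", 43),
}


def check_length(sequence, declared):
    """Return True if the sequence, ignoring spaces, has the declared length."""
    bases = "".join(sequence.split()).upper()
    unexpected = set(bases) - set("ACGT")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    return len(bases) == declared


results = {seq_id: check_length(seq, n) for seq_id, (seq, n) in PRIMERS.items()}
print(results)  # every value is True for the sequences as printed
```

A mismatch reported by such a check would point to a transcription error in either the sequence line or the declared length.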


(2) INFORMATION FOR SEQ ID NO: 51: 


Met 


Cys 


Gly 


Glu 


Asn 


65 


Ile 


Asn 


Ser 


(2) 


Asp 


Gly 


ser 


Pro 


Gly 


65 


His 


Cys 


Ile 


Ala 


Cys 


145 


Ser 


Phe 


US 6,372,224 B1 




(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 128 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: unknown 
(ii) MOLECULE TYPE: protein 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51: 


Ile Val Leu Val Thr Cys Leu Leu Phe Ser Tyr 
5 10 


Thr Ser Asn Asn Asp Cys Val Gln Val Asn Val 
20 25 


Asn Glu Asn Ile Ile Lys Asp Phe Leu Phe His 
35 40 


Gly Ser Val Val Val Gly Gly Tyr Tyr Pro Thr 
50 55 60 


Cys Ser Arg Ser Ala Thr Thr Thr Ala Tyr Lys 
70 75 


His Ala Phe Tyr Phe Asp Met Glu Ala Met Glu 
85 90 


Ala Arg Gly Lys Pro Leu Leu Val His Val His 
100 105 


Ile Ile Ile Tyr Ile Ser Ala Tyr Arg Asp Asp 
115 120 
INFORMATION FOR SEQ ID NO: 52: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1101 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: unknown 
(ii) MOLECULE TYPE: protein 


(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52: 


Val Gln Ser Gly Met Gly Ala Thr Val Phe Ser 
5 10 


Gly Val Ile Leu Glu Ile Ser Cys Tyr Asn Asp 
20 25 


Ser Phe Tyr Ser Tyr Gly Glu Ile Ser Phe Gly 
35 40 


Arg Tyr Cys Tyr Ala Leu Tyr Asn Gly Thr Ala 
50 55 60 


Thr Leu Pro Pro Ser Val Lys Glu Ile Ala Ile 
70 75 


Phe Tyr Ile Asn Gly Tyr Asn Phe Phe Ser Thr 
85 90 


Ile Ser Phe Asn Leu Thr Thr Gly Asp Ser Gly 
100 105 


Ala Tyr Thr Ser Tyr Thr Asp Ala Leu Val Gln 
115 120 


Ile Lys Lys Val Thr Tyr Cys Asn Ser His Ile 
130 135 140 


Ser Gln Leu Thr Ala Asn Leu Gln Asn Gly Phe 
150 155 


Ser Glu Val Gly Leu Val Asn Lys Ser Val Val 
165 170 


Tyr Ser His Thr Ser Val Asn Ile Thr Ile Asp 


Asn 


Thr 


Thr 


45 


Glu 


Asp 


Asn 


Gly 


val 
125 


Leu 


Thr 


Val 


45 


Leu 


ser 


Phe 


Ala 


Val 


125 


Asn 


Tyr 


Leu 


Leu 


Ser 


Gln 


30 


Phe 


Val 


Phe 


Ser 


Asp 


110 


Gln 


Asn 


val 


30 


Thr 


Lys 


Lys 


Pro 


Phe 


110 


Glu 


Asn 


Pro 


Leu 


Gly 


Val 
15 

Leu 
Lys 
Trp 
Ser 
Thr 
95 


Pro 


Gly 


Thr 


15 


Ser 


Asp 


Tyr 


Trp 


Ile 


95 


Trp 


Asn 


Ile 


Val 


Pro 


175 


Met 


Tl 


Pr 


Gl 


Ty 


As 


80 


Gl 


Va 


Thr 


Glu 


Gly 


Leu 


Gly 


80 


Asp 


Thr 


Thr 


Lys 


Ala 


160 


Ser 


Lys 




Arg 


Leu 


Gln 


225 


Asp 


Ile 


Leu 


cys 


Arg 


305 


Pro 


Ser 


Arg 


Ser 


Ser 


385 


Ala 


Thr 


Tyr 


Gly 


465 


Pro 


val 


Cys 


Thr 


Gly 


545 


Glu 


Thr 


Leu 


Ser 


Pro 


210 


Phe 


val 


Lys 


Thr 


Lys 


290 


Ser 


Ser 


Cys 


Gln 


Gly 


370 


val 


Ile 


His 


Thr 


Cys 


450 


Ala 


Ile 


Gln 


Ser 


Gln 


530 


Ala 


Asn 


Leu 


Gly 


Gly 


195 


Met 


Ser 


Phe 


Thr 


Phe 


275 


Phe 


Leu 


Asp 


Thr 


Thr 


355 


Asp 


Thr 


val 


Trp 


Asn 


435 


Glu 


Leu 


ser 


val 


Arg 


515 


Tyr 


Arg 


Ala 


Asp 


Gly 
595 


180 


Tyr 


Gln 


val 


Asn 


Gly 


260 


Asn 


Asp 


Tyr 


Asn 


Asp 


340 


Asn 


Leu 


Pro 


Gly 


Thr 


420 


Glu 


Pro 


val 


Thr 


Glu 


500 


Tyr 


val 


Leu 


Leu 


Pro 
580 





Leu 


Gly 


Asp 


Tyr 


Ser 


245 


Thr 


Lys 


val 


val 


Ser 


325 


Tyr 


Ser 


Leu 


Cys 


Ala 


405 


Thr 


Arg 


Tle 


Phe 


Gly 


485 


Tyr 


val 


Ser 


Glu 


Lys 


565 


Ile 


Lys 


Gln 


Asn 


val 


230 


Asp 


Cys 


Phe 


Ala 


Ile 


310 


Gly 


Asn 


Thr 


Gly 


Asp 


390 


Met 


Thr 


Thr 


Ile


Ile 


470 


Asn 


Ile


Cys 


Ala 


Asn 


550 


Leu 


Tyr 


Asp 




Pro 
Asn 
215 
His 
Cys 
Pro 
Cys 
Ala 
295 
Tyr 
Leu 
Ile 
Leu 
Phe 
375 
Val
Thr
Pro
Arg
Thr
455
Asn
Val
Gln 
Asn 
Cys 
535 
Met 
Ala 


Lys 


Ile 


Ile 


200 


Thr 


Ser 


Thr 


Phe 


Leu 


280 


Arg 


Glu 


His 


Tyr 


Leu 


360 


Lys 


Ser 


Ser 


Asn 


Gly 


440 


Tyr 


Val


Thr


Val


Gly 


520 


Gln 


Glu 


Ser 


Glu 


Leu 
600 


185 


Ala 


Asp 


Thr 


Asp 


Ser 


265 


Ser 


Thr 


Glu 


Asp 


Gly 


345 


Ser 


Asn 


Ala 


Ile 


Phe 


425 


Thr 


Ser 


Thr 


Ile 


Tyr 


505 


Asn 


Thr 


Ile 


Val 


Trp 


585 


Pro 


Ser 


Val 


Cys 


Val 


250 


Phe 


Leu 


Arg 


Gly 


Leu 


330 


Arg 


Gly 


Val


Gln 


Asn 


410 


Tyr 


Ala 


Asn 


His 


Pro 


490 


Thr 


Pro 


Ile 


Asp 


Glu 


570 


Pro 


Ser 


Thr 


Tyr 


Lys 


235 


Leu 


Asp 


Asn 


Thr 


Asp 


315 


Ser 


Thr 


Leu 


Ser 


Ala 


395 


Ser 


Tyr 


Ile 


Ile 


Ser 


475 


Thr 


Thr 


Arg 


Glu 


Ser 


555 


Ala 


Asn 


His 
Leu


Cys 


220 


Ser 


Tyr 


Lys 


Pro 


Asn 


300 


Asn 


Val


Gly 


Tyr 


Asp 


380 


Ala 


Glu 


Tyr 


Asp 


Gly 


460 


Asp 


Asn 


Pro 


Cys 


Gln 


540 


Met 


Phe 


Ile 


Asn 


Ser 


205 


Ile 


Ser 


Ala 


Leu 


Val 


285 


Glu 


Ile 


Leu 


Val


Tyr 


365 


Gly 


Val


Met 


Ser 


Ser 


445 


Val 


Gly 


Phe 


Val 


Asn 


525 


Ala 


Leu 


Asn 


Gly 


Ser 
605 


190 


Asn 


Arg 


Leu 


Thr 


Asn 


270 


Gly 


Gln 


Val 


His 


Gly 


350 


Thr 


Val


Ile 


Leu 


Ile 


430 


Asn 


Cys 


Asp 


Thr 


Ser 


510 


Lys 


Leu 


Phe 


Ser 


Gly 


590 


Lys 


Ile 


Ser 


Trp 


Ala 


255 


Asn 


Ala 


Val 


Gly 


Leu 


335 


Ile 


Ser 


Ile 


Asp 


Gly 


415 


Tyr 


Asp 


Lys 


Val 


Ile 


495 


Ile 


Leu 


Ala 


Val 


Thr 


575 


Ser 


Thr 


Asn 


Asp 


240 


Val 


Tyr 


Asn 


Val 


Val 


320 


Asp 


Ile


Leu 


Tyr 


Gly 


400 


Leu 


Asn 


Val


Asn 


Gln 


480 


Ser 


Asp 


Leu 


Met 


Ser 


560 


Glu 


Trp 


Lys 


Tyr Arg Ser Ala Ile Glu Asp Leu Leu Phe Asp Lys Val Val Thr Ser
610 615 620 


Gly Leu Gly Thr Val Asp Glu Asp Tyr Lys Arg Cys Thr Gly Gly Tyr 
625 630 635 640 


Asp Ile Ala Asp Leu Val Cys Ala Gln Tyr Tyr Asn Gly Ile Met Val 
645 650 655 


Leu Pro Gly Val Ala Asn Asp Asp Lys Met Ala Met Tyr Thr Ala Ser 
660 665 670 





Leu Ala Gly Gly Ile Thr Leu Gly Ala Leu Gly Gly Gly Ala Val Ser 
675 680 685 


Ile Pro Phe Ala Ile Ala Val Gln Ala Arg Leu Asn Tyr Val Ala Leu 
690 695 700 


Gln Thr Asp Val Leu Ser Lys Asn Gln Gln Ile Leu Ala Asn Ala Phe 
705 710 715 720 





Asn Gln Ala Ile Gly Asn Ile Thr Gln Ala Phe Gly Lys Val Asn Asp 
725 730 735 


Ala Ile His Gln Thr Ser Gln Gly Leu Ala Thr Val Ala Lys Ala Leu 
740 745 750 


Ala Lys Val Gln Asp Val Val Asn Thr Gln Gly Gln Ala Leu Ser His 
755 760 765


Leu Thr Val Gln Leu Gln Asn Asn Phe Gln Ala Ile Ser Ser Ser Ile 
770 775 780 


Ser Asp Ile Tyr Asn Arg Leu Asp Glu Leu Ser Ala Asp Ala Gln Val 
785 790 795 800 


Asp Arg Leu Ile Thr Gly Arg Leu Thr Ala Leu Asn Ala Phe Val Ser 
805 810 815 


Gln Thr Leu Thr Arg Gln Ala Glu Val Arg Ala Ser Arg Gln Leu Ala 
820 825 830 


Lys Asp Lys Val Asn Glu Cys Val Arg Ser Gln Ser Gln Arg Phe Gly 
835 840 845 


Phe Cys Gly Asn Gly Thr His Leu Phe Ser Leu Ala Asn Ala Ala Pro 
850 855 860 


Asn Gly Met Ile Phe Phe His Thr Val Leu Leu Pro Thr Ala Tyr Glu 
865 870 875 880 


Thr Val Thr Ala Trp Ser Gly Ile Cys Ala Ser Asp Gly Asp Arg Thr 
885 890 895 


Phe Gly Leu Val Val Lys Asp Val Gln Leu Thr Leu Phe Arg Asn Leu 
900 905 910 


Asp Asp Lys Phe Tyr Leu Thr Pro Arg Thr Met Tyr Gln Pro Arg Val 
915 920 925 


Ala Thr Ser Ser Asp Phe Val Gln Ile Glu Gly Cys Asp Val Leu Phe 
930 935 940 


Val Asn Ala Thr Val Ile Asp Leu Pro Ser Ile Ile Pro Asp Tyr Ile 
945 950 955 960 


Asp Ile Asn Gln Thr Val Gln Asp Ile Leu Glu Asn Phe Arg Pro Asn 
965 970 975 


Trp Thr Val Pro Glu Leu Pro Leu Asp Ile Phe Asn Ala Thr Tyr Leu 
980 985 990 


Asn Leu Thr Gly Glu Ile Asn Asp Leu Glu Phe Arg Ser Glu Lys Leu 
995 1000 1005 


His Asn Thr Thr Val Glu Leu Ala Ile Leu Ile Asp Asn Ile Asn Asn 
1010 1015 1020 
Thr Leu Val Asn Leu Glu Trp Leu Asn Arg Ile Glu Thr Tyr Val Lys
1025 1030 1035 1040


Trp Pro Trp Tyr Val Trp Leu Leu Ile Gly Leu Val Val Ile Phe Cys 
1045 1050 1055 


Ile Pro Ile Leu Leu Phe Cys Cys Cys Ser Thr Gly Cys Cys Gly Cys
1060 1065 1070 





Ile Gly Cys Leu Gly Ser Cys Cys His Ser Ile Cys Ser Arg Arg Arg
1075 1080 1085 


Phe Glu Ser Tyr Glu Pro Ile Glu Lys Val His Val His 
1090 1095 1100 
(2) INFORMATION FOR SEQ ID NO: 53: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 362 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: unknown 
(ii) MOLECULE TYPE: protein 


(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53: 


Met Ile Val Leu Val Thr Cys Leu Leu Phe Ser Tyr Asn Ser Val Ile 
1 5 10 15


Cys Thr Ser Asn Asn Asp Cys Val Gln Val Asn Val Thr Gln Leu Pro 
20 25 30 


Gly Asn Glu Asn Ile Ile Lys Asp Phe Leu Phe His Thr Phe Lys Glu 
35 40 45 


Glu Gly Ser Val Val Val Gly Gly Tyr Tyr Pro Thr Glu Val Trp Tyr 
50 55 60 


Asn Cys Ser Arg Ser Ala Thr Thr Thr Ala Tyr Lys Asp Phe Ser Asn 
65 70 75 80


Ile His Ala Phe Tyr Phe Asp Met Glu Ala Met Glu Asn Ser Thr Gly 
85 90 95 


Asn Ala Arg Gly Lys Pro Leu Leu Val His Val His Gly Asp Pro Val 
100 105 110 


Ser Ile Ile Ile Tyr Ile Ser Ala Tyr Arg Asp Asp Val Gln Gly Arg 
115 120 125 


Pro Leu Leu Lys His Gly Leu Leu Cys Ile Thr Lys Asn Lys Ile Ile 
130 135 140 


Asp Tyr Asn Thr Phe Thr Ser Ala Gln Trp Ser Ala Ile Cys Leu Gly 
145 150 155 160 


Asp Asp Arg Lys Ile Pro Phe Ser Val Ile Pro Thr Gly Asn Gly Thr 
165 170 175 


Lys Ile Phe Gly Leu Glu Trp Asn Asp Asp Tyr Val Thr Ala Tyr Ile 
180 185 190 


Ser Asp Arg Ser His His Leu Asn Ile Asn Asn Asn Trp Phe Asn Asn 
195 200 205


Val Thr Ile Leu Tyr Ser Arg Ser Ser Thr Ala Thr Trp Gln Lys Ser 
210 215 220 


Ala Ala Tyr Val Tyr Gln Gly Val Ser Asn Phe Thr Tyr Tyr Lys Leu 
225 230 235 240 


Asn Asn Thr Asn Gly Leu Lys Ser Tyr Glu Leu Cys Glu Asp Tyr Glu 
245 250 255 





Cys Cys Thr Gly Tyr Ala Thr Asn Val Phe Ala Pro Thr Val Gly Gly 
260 265 270 


Tyr Ile Pro Asp Gly Phe Ser Phe Asn Asn Trp Phe Met Leu Thr Asn 
275 280 285


Ser Ser Thr Phe Val Ser Gly Arg Phe Val Thr Asn Gln Pro Leu Leu 
290 295 300 


Val Asn Cys Leu Trp Pro Val Pro Ser Leu Gly Val Ala Ala Gln Glu 
305 310 315 320 


Phe Cys Phe Glu Gly Ala Gln Phe Ser Gln Cys Asn Gly Val Ser Leu 
325 330 335 


Asn Asn Thr Val Asp Val Ile Arg Phe Asn Leu Asn Phe Thr Thr Asp 
340 345 350 


Val Gln Ser Gly Met Gly Ala Thr Val Phe 
355 360 
(2) INFORMATION FOR SEQ ID NO: 54: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1101 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: unknown 
(ii) MOLECULE TYPE: protein 


(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54: 


Ala Ala Tyr Val Tyr Gln Gly Val Ser Asn Phe Thr Tyr Tyr Lys Leu 
1 5 10 15


Asn Asn Thr Asn Gly Leu Lys Ser Tyr Glu Leu Cys Glu Asp Tyr Glu 
20 25 30 


Cys Cys Thr Gly Tyr Ala Thr Asn Val Phe Ala Pro Thr Val Gly Gly 
35 40 45 


Tyr Ile Pro Asp Gly Phe Ser Phe Asn Asn Trp Phe Met Leu Thr Asn 
50 55 60 


Ser Ser Thr Phe Val Ser Gly Arg Phe Val Thr Asn Gln Pro Leu Leu 
65 70 75 80 


Val Asn Cys Leu Trp Pro Val Pro Ser Leu Gly Val Ala Ala Gln Glu 
85 90 95 


Phe Cys Phe Glu Gly Ala Gln Phe Ser Gln Cys Asn Gly Val Ser Leu 
100 105 110 


Asn Asn Thr Val Asp Val Ile Arg Phe Asn Leu Asn Phe Thr Thr Asp 
115 120 125 


Val Gln Ser Gly Met Gly Ala Thr Val Phe Ser Leu Asn Thr Thr Gly 
130 135 140 


Gly Val Ile Leu Glu Ile Ser Cys Tyr Asn Asp Thr Val Ser Glu Ser 
145 150 155 160 


Ser Phe Tyr Ser Tyr Gly Glu Ile Ser Phe Gly Val Thr Asp Gly Pro 
165 170 175 


Arg Tyr Cys Tyr Ala Leu Tyr Asn Gly Thr Ala Leu Lys Tyr Leu Gly 
180 185 190 


Thr Leu Pro Pro Ser Val Lys Glu Ile Ala Ile Ser Lys Trp Gly His 
195 200 205 


Phe Tyr Ile Asn Gly Tyr Asn Phe Phe Ser Thr Phe Pro Ile Asp Cys 
210 215 220 


Ile Ser Phe Asn Leu Thr Thr Gly Asp Ser Gly Ala Phe Trp Thr Ile 
225 230 235 240 


Ala Tyr Thr Ser Tyr Thr Asp Ala Leu Val Gln Val Glu Asn Thr Ala 
245 250 255 


Ile Lys Lys Val Thr Tyr Cys Asn Ser His Ile Asn Asn Ile Lys Cys 
260 265 270 


Ser 


Ser 


Tyr 


305 


Ser 


Pro 


Phe 


Val


Lys 


385 


Thr 


Lys 


Ser 


Ser 


Cys 


465 


Gln 


Gly 


Val 


Ile 


His 


545 


Thr 


Cys 


Ala 


Ile 


Gln 


625 


Ser 


Gln 


Ala 


Gln 


Glu 


290 


Ser 


Gly 


Met 


Ser 


Phe 


370 


Thr 


Phe 


Phe 


Leu 


Asp 


450 


Thr 


Thr 


Asp 


Thr 


Val


530 


Trp 


Asn 


Glu 


Leu 


Ser 


610 


Val


Arg 


Tyr 


Arg 


Leu 


275 


Val


His 


Tyr 


Gln 


Val


355 


Asn 


Gly 


Asn 


Asp 


Tyr 


435 


Asn 


Asp 


Asn 


Leu 


Pro 


515 


Gly 


Thr 


Glu 


Pro 


Val


595 


Thr 


Glu 


Tyr 


Val


Leu 
675 


Thr 


Gly 


Thr 


Gly 


Asp 


340 


Tyr 


Ser 


Thr 


Lys 


Val


420


Val


Ser 


Tyr 


Ser 


Leu 


500 


Cys 


Ala 


Thr 


Arg 


Ile 


580 


Phe 


Gly 


Tyr 


Val


Ser 


660 


Glu 


Ala 


Leu 


Ser 


Gln 


325 


Asn 


Val


Asp 


Cys


Phe 


405 


Ala 


Ile 


Gly 


Asn 


Thr 


485 


Gly 


Asp 


Met 


Thr 


Thr 


565 


Ile 


Ile 


Asn 


Ile 


Cys 


645 


Ala 


Asn 


Asn 


Val


Val


310 


Pro 


Asn 


His 


Cys


Pro 


390 


Cys 


Ala 


Tyr 


Leu 


Ile 


470 


Leu 


Phe 


Val


Thr 


Pro 


550 


Arg 


Thr 


Asn 


Val


Gln 


630 


Asn 


Cys 


Met 




Leu 
Asn 
295 
Asn 
Ile 
Thr 
Ser 
Thr 
375 
Phe 
Leu 
Arg 
Glu 
His 
455 
Tyr 
Leu 
Lys 
Ser 
Ser 
535 
Asn 
Gly 
Tyr 
Val
Thr
615
Val
Gly 


Gln 


Glu 


Gln 


280 


Lys 


Ile 


Ala 


Asp 


Thr 


360 


Asp 


Ser 


Ser 


Thr 


Glu 


440 


Asp 


Gly 


Ser 


Asn 


Ala 


520 


Ile 


Phe 


Thr 


Ser 


Thr 


600 


Ile


Tyr


Asn 


Thr 


Ile 
680 


Asn 


Ser 


Thr 


Ser 


Val 


345 


Cys 


Val 


Phe 


Leu 


Arg 


425 


Gly 


Leu 


Arg 


Gly 


Val


505 


Gln 


Asn 


Tyr 


Ala 


Asn 


585 


His 


Pro 


Thr 


Pro 


Ile 


665 


Asp 


Gly 


Val 


Ile 


Thr 


330 


Tyr 


Lys 


Leu 


Asp 


Asn 


410 


Thr 


Asp 


Ser 


Thr 


Leu 


490 


Ser 


Ala 


Ser 


Tyr 


Ile 


570 


Ile 


Ser 


Thr 


Thr 


Arg 


650 


Glu 


Ser 


Phe 


Val 


Asp 


315 


Leu 


Cys 


Ser 


Tyr 


Lys 


395 


Pro 


Asn 


Asn 


Val 


Gly 


475 


Tyr 


Asp 


Ala 


Glu 


Tyr 


555 


Asp 


Gly 


Asp 


Asn 


Pro 


635 


Cys 


Gln 


Met 
Tyr


Leu 


300 


Leu 


Ser 


Ile 


Ser 


Ala 


380 


Leu 


Val


Glu 


Ile 


Leu 


460 


Val


Tyr 


Gly 


Val


Met 


540 


Ser


Ser


Val 


Gly 


Phe 


620 


Val 


Asn 


Ala 


Leu 


Pro Val Ala 


285 


Leu 


Gly 


Asn 


Arg 


Leu 


365 


Thr 


Asn 


Gly 


Gln 


Val


445 


His 


Gly 


Thr 


Val


Ile 


525 


Leu 


Ile


Asn 


Cys 


Asp 


605 


Thr 


Ser 


Lys 


Leu 


Phe 
685 


Pro 


Met 


Ile 


Ser 


350 


Trp 


Ala 


Asn 


Ala 


Val


430 


Gly 


Leu 


Ile 


Ser 


Ile 


510 


Asp 


Gly 


Tyr 


Asp 


Lys 


590 


Val 


Ile 


Ile 


Leu 


Ala 


670 


Val 


Ser 


Lys 


Thr 


335 


Asn 


Asp 


Val 


Tyr 


Asn 


415 


Val 


Val


Asp 


Ile 


Leu 


495 


Tyr 


Gly 


Leu 


Asn 


Val 


575 


Asn 


Gln 


Ser 


Asp 


Leu 


655 


Met 


Ser 


Ser 


Phe 


Arg 


320 


Leu 


Gln 


Asp 


Ile 


Leu 


400 


Cys 


Arg 


Pro 


Ser 


Arg 


480 


Ser 


Ser 


Ala 


Thr 


Tyr 


560 


Asp 


Gly 


Pro 


Val 


Cys 


640 


Thr 


Gly 


Glu 




Asn Ala Leu 
690 


Leu Asp Pro 
705 


Gly Gly Leu 
Arg Ser Ala 
Leu Gly Thr 
755 


Ile Ala Asp 


Pro Gly Val 
785 





Ala Gly Gly 
Pro Phe Ala 
Thr Asp Val 


835 


Gln Ala Ile 
850 


Ile His Gln 
865 


Lys Val Gln 
Thr Val Gln 
Asp Ile Tyr 


915 


Arg Leu Ile 
930 


Thr Leu Thr 
945 


Asp Lys Val 
Cys Gly Asn 
Gly Met Ile 


995 


Val Thr Ala 
1010 


Gly Leu Val 
1025 


Asp Lys Phe 


Thr Ser Ser 


Lys 


Ile 


Lys 


Ile 


740 


Val


Leu 


Ala 


Ile 


Ile 


820 


Leu 


Gly 


Thr 


Asp 


Leu 


900 


Asn 


Thr 


Arg 


Asn 


Gly 


980 


Phe 


Trp 


Val 


Tyr 


Asp 


1060 


Leu 


Tyr 


Asp 


725 


Glu 


Asp 


Val


Asn 


Thr 


805 


Ala 


Ser 


Asn 


Ser 


Val


885 


Gln 


Arg 


Gly 


Gln 


Glu 


965 


Thr 


Phe 


Ser 


Lys 


Leu 


1045 


Phe 


Asn Ala Thr Val Ile 


1075 


Ile Asn Gln Thr Val 


1090 




Ala Ser Val Glu Ala 
695 


Lys Glu Trp Pro Asn 
710 


Ile Leu Pro Ser His 
730 


Asp Leu Leu Phe Asp 
745 


Glu Asp Tyr Lys Arg 
760 


Cys Ala Gln Tyr Tyr 
775 


Asp Asp Lys Met Ala 
790 


Leu Gly Ala Leu Gly 
810 


Val Gln Ala Arg Leu 
825 


Lys Asn Gln Gln Ile 
840 


Ile Thr Gln Ala Phe 
855 


Gln Gly Leu Ala Thr 
870 


Val Asn Thr Gln Gly 
890 


Asn Asn Phe Gln Ala 
905 


Leu Asp Glu Leu Ser 
920 


Arg Leu Thr Ala Leu 
935 


Ala Glu Val Arg Ala 
950 


Cys Val Arg Ser Gln 
970 


His Leu Phe Ser Leu 
985 


His Thr Val Leu Leu 
1000 


Gly Ile Cys Ala Ser 
1015 


Asp Val Gln Leu Thr 
1030 


Thr Pro Arg Thr Met 
1050 


Val Gln Ile Glu Gly 
1065 


Asp Leu Pro Ser Ile 
1080 


Gln Asp Ile Leu Glu 
1095 
Phe


Ile 


715 


Asn 


Lys 


Cys 


Asn 


Met 


795 


Gly 


Asn 


Leu 


Gly 


Val 


875 


Gln 


Ile 


Ala 


Asn 


Ser 


955 


Ser 


Ala 


Pro 


Asp 




Asn 


700 


Gly 


Ser 


Val 


Thr 


Gly 


780 


Tyr 


Gly 


Tyr 


Ala 


Lys 


860 


Ala 


Ala 


Ser 


Asp 


Ala 


940 


Arg 


Gln 


Asn 


Thr 


Gly 
1020 


Ser Thr Glu 
Gly Ser Trp 
Lys Arg Lys 


735 


Val Thr Ser 
750 


Gly Gly Tyr 
765 


Ile Met Val 
Thr Ala Ser 
Ala Val Ser 


815 


Val Ala Leu 
830 


Asn Ala Phe 
845 


Val Asn Asp 
Lys Ala Leu 
Leu Ser His 


895 


Ser Ser Ile 
910 


Ala Gln Val 
925 


Phe Val Ser 


Gln Leu Ala 


Arg Phe Gly 
975 


Ala Ala Pro 
990 


Ala Tyr Glu 
1005 


Asp Arg Thr 


Leu Phe Arg Asn Leu 


1035 


Tyr Gln Pro Arg Val 


1055 


Cys Asp Val Leu Phe 


1070 


Ile Pro Asp Tyr Ile 


1085 


Asn Phe Arg 


1100 


Thr 


Leu 


720 


Tyr 


Gly 


Asp 


Leu 


Leu 


800 


Ile 


Gln 


Asn 


Ala 


Ala 


880 


Leu 


Ser 


Asp 


Gln 


Lys 


960 


Phe 


Asn 


Thr 


Phe 


Asp 


1040 


Ala 


Val 


Asp 




(2) INFORMATION FOR SEQ ID NO: 55:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 701 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: unknown
(ii) MOLECULE TYPE: DNA (genomic)


(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55: 


TCAACCATTA TTGGTTAATT GTTTGTGGCC AGTGCCCAGT CTTGGTGTCG


ATTTTGTTTT 


GGATGTCATT 


AGTATTTTCA 


AGTGAGTGAG 


GCGTTACTGT 


TAGTGTCAAG 


CTTTAGCACT 


ATTTTGGACA 


TATTAAAAAG 


TGCTAATTTG 


GAGTGTTGTG 


GAAGGTGCGC 


AGATTCAACC 


CTGAATACAA 


TCAAGTTTCT 


TACGCACTCT 


GAAATTGCTA 


TTTCCTATTG 


ATTGCTTACA 


GTGACGTATT 


CAAAATGGAT 


TTACTACCTA 


AGTTTAGCCA 


TTAATTTTAC 


CAGGTGGTGT 


ACAGTTATGG 


ATAATGGCAC 


TTAGTAAGTG 


ATTGTATATC 


CATCGTACAC 


GTAACAGTCA 


TTTATCCTGT 


GTTTCTATTC 


ATGTAATGGT 


CACAGATGTA 


CATTCTTGAG 


TGAAATTTCA 


GGCTCTTAAG 


GGGCCATTTT 


TTTTAATTTA 


TGACGCATTA 


CATTAATAAC 


TGCTTCAAGT 


ACATACCAGT 


(2) INFORMATION FOR SEQ ID NO: 56: 


(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1401 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: unknown 


(ii) MOLECULE TYPE: DNA (genomic) 


GTGTCTTTAA 


CAATCTGGTA 


ATTTCTTGTT 


TTCGGCGTAA 


TATTTAGGAA 


TATATTAATG 


ACCACTGGTG 


GTACAAGTTG 


ATTAAATGTT 


GAAGTTGGTC 


G 


(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56: 


AGCACCGGTA ATGTCACGAT ACCTACAAAT 


CAGGTTTACA 


AGATGCAATA 


GCAATGGGTG 


GCCCTTAAAT 


AAAGAATGGC 


CACAACAGCA 


ACATCTGGCT 


GCTGACTTAG 


GATGACAAGA 


GGTGGTGGCG 


GCTCTACAAA 


GCTATTGGTA 


CAAGGTCTTG 


GGGCAAGCTT 


CTACACCGGT 


AATTGTTAAC 


CCAGACTTGA 


TGGCATCTGT 


CTAACATTGG 


AACGTAAGTA 


TAGGI 


TACAGT 


TGTGTGCACA 


TGGCT 


TATGTA 


CAGTGTCTAT 


CTGAT 


ACATT 


CTACT 


[GTATT 


TACACA 





[GTTGC 


TAAGCCACCT 


GTCAATAGAT 


GCAATACGTT 


AAACATGGAG 


TGAAGCATTC 


TGGTTCTTGG 


CCGGTCGGCT 


TGATGAAGAT 


ATATTACAAT 


CACTGCATCT 


ACCTTTTGCA 


GAGCAAGAAC 


GGCATTTGGT 


TAAAGCATTG 


AACAGTACAA 


TTTACCATAT 


TGTTCAAGGT 


TCTGCATGTC 


ATTGATTCCA 


AATAGTACGG 


CTAGGAGGTT 


ATAGAAGATT 


TATAAACGTT 


GGCATCATGG 


CTTGCAGGTG 


ATAGCAGTTC 


CAGCAGATCC 


AAGGTTAATG 


GCAAAAGTGC 


TTGCAAAATA 


CTGTGCAAGT 


ACGTTTGCAA 


AAACTATTGA 


TGTTGTTTGT 


AAACTTTAGA 


TAAAAGACAT 


TGCTTTTTGA 


GTACAGGTGG 


TGCTACCTGG 


GTATAACATT 


AAGCCAGACT 


TGGCTAATGC 


ATGCTATACA 


AAGATGTTGT 


ATTTCCAAGC 


CAGCACAAGA 


ACAATACAGT 


TGGGTGCTAC 


ATAATGATAC 


CTGATGGACC 


CATTACCACC 


GTTACAATTT 


ATAGTGGAGC 


AAAACACAGC 


CTCAACTTAC 


TTGTCAATAA 


TGAGTACATT 


TGGTAACCCT 


GCAAGCACTT 


TTCGGAAAAT 


TCCTATTTAC 


ATTGCCATCT 


TAAGGTTGTA 


TTATGACATA 


TGTAGCTAAT 


AGGTGCACTT 


TAATTATGTT 


TTTCAATCAA 


TCAAACGTCA 


TAACACACAA 


CATTAGTAGT 


60 


120 


180 


240 


300 


360 


420 


480 


540 


600 


660 


701 


60 


120 


180 


240 


300 


360 


420 


480 


540 


600 


660 


720 


780 


840 


900 




TCCATTAGTG 


CTGATTACAG 


GCAGAGGTTA 


CAATCTCAGA 


GCACCAAATG 


ACGGCCTGGT 


GATGTCCAGT 


ATGTATCAGC 


TTGTTTGTTA


(2) INFORMATION FOR SEQ ID NO: 


ACATT 




[ITATAA CAGGCTTGAT 


GAAGACTTAC AGCACTTAAT 


GGGCT 


GATTI 


TAGCAG ACAGCTTGCT 


[GGATT CTGTGGTAAT 


GCATGATCTT CTTTCACACA 





CAGGT 


[TATTTG TGCATCAGAT 


TGACGCTGTT TCGCAATCTA 


CTAGAGTTGC AACTAGTTCT 


ATGCAACTGT A 


GAATTGAGTG 


GCATTTGTGT 


AAAGACAAGG 


GGTACACATT 


GTGCTATTAC 


GGCGATCGTA 


GATGACAAAT 


GATTTTGTTC 


57: 


(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 250 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: unknown
(ii) MOLECULE TYPE: protein


CTGATGCACA 


CTCAGACTTT 


TAAATGAATG 


TATTTTCACT 


CAACAGCTTA 


CTTTTGGACT 


TCTATTTGAC 


AAATTGAAGG 


(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57: 


AGTTGACAGG 


AACCAGACAA 


CGTTAGGTCT 


TGCAAATGCA 


TGAAACCGTG 


TGTTGTTAAG 


TCCCAGAACT 


ATGTGATGTG 


Met Ile Val Leu Val Thr Cys Leu Leu Phe Ser Tyr Asn Ser Val Ile 


Cys Thr Ser 
Gly Asn Glu 

35 
Glu Gly Ser 
Asn Cys Ser 
65 


Ile His Ala 
Asn Ala Arg 
Ser Ile Ile 


115 


Pro Leu Leu 
130 


Asp Tyr Asn 
145 


Asp Asp Arg 
Lys Ile Phe 
Ser Asp Arg 


195 


Val Thr Ile 
210 


Ala Ala Tyr 
225 


Asn Asn Thr 


Asn 


20 


Asn 


Val


Arg 


Phe 


Gly 


100 


Ile 


Lys 


Thr 


Lys 


Gly 


180 


Ser 


Leu 


Val 


Asn 


5 


Asn 


Ile 


Val


Ser 


Tyr 


85 


Lys 


Tyr 


His 


Phe 


Ile 


165 


Leu 


His 


Tyr 


Tyr 


Gly 
245 


Asp Cys Val 
Ile Lys Asp 
40 


Val Gly Gly 
55


10 


Gln Val Asn 
25 


Phe Leu Phe 


Tyr Tyr Pro 


Ala Thr Thr Thr Ala Tyr 


70 


75 


Phe Asp Met Glu Ala Met 


90 


Pro Leu Leu Val His Val 


105 


Ile Ser Ala Tyr Arg Asp 


120 


Gly Leu Leu Cys Ile Thr 


135 


Thr Ser Ala Gln Trp Ser 


150 


155 


Pro Phe Ser Val Ile Pro 


170 


Glu Trp Asn Asp Asp Tyr 


His Leu Asn 
200 


Ser Arg Ser 
215 


Gln Gly Val 
230 


185 


Ile Asn Asn 


Ser Thr Ala 


Ser Asn Phe 


235 


Leu Lys Ser Tyr Glu 


250 


15 


Val Thr Gln Leu Pro 


30 


His Thr Phe Lys Glu 


45 


Thr Glu Val 


60 


Lys Asp Phe 
Glu Asn Ser 
His Gly Asp 


110 


Asp Val Gln 


125 


Lys Asn Lys 


140 


Ala Ile Cys 


Thr Gly Asn 


Val Thr Ala 


190 


Asn Trp Phe 


205 


Thr Trp Gln 


220 


Thr Tyr Tyr 


Trp Tyr 
Ser Asn 
80 


Thr Gly 
95 


Pro Val 


Gly Arg 


Ile Ile 


Leu Gly 


160 


Gly Thr 
175 

Tyr Ile 
Asn Asn 


Lys Ser 


Lys Leu
240 


960 


1020 


1080 


1140 


1200 


1260 


1320 


1380 


1401 


(2) INFORMATION FOR SEQ ID NO: 58: 


(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 201 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: unknown 


(ii) MOLECULE TYPE: protein 





(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58: 


Ser Phe Asn Leu Thr Thr Gly Asp Ser Gly Ala Phe Trp Thr Ile Ala 
1 5 10 15 


Tyr Thr Ser Tyr Thr Asp Ala Leu Val Gln Val Glu Asn Thr Ala Ile 
20 25 30 


Lys Lys Val Thr Tyr Cys Asn Ser His Ile Asn Asn Ile Lys Cys Ser 
35 40 45 


Gln Leu Thr Ala Asn Leu Gln Asn Gly Phe Tyr Pro Val Ala Ser Ser 
50 55 60 


Glu Val Gly Leu Val Asn Lys Ser Val Val Leu Leu Pro Ser Phe Tyr 
65 70 75 80 


Ser His Thr Ser Val Asn Ile Thr Ile Asp Leu Gly Met Lys Arg Ser 
85 90 95


Gly Tyr Gly Gln Pro Ile Ala Ser Thr Leu Ser Asn Ile Thr Leu Pro 
100 105 110 


Met Gln Asp Asn Asn Thr Asp Val Tyr Cys Ile Arg Ser Asn Gln Phe 
115 120 125 


Ser Val Tyr Val His Ser Thr Cys Lys Ser Ser Leu Trp Asp Asp Val 
130 135 140 


Phe Asn Ser Asp Cys Thr Asp Val Leu Tyr Ala Thr Ala Val Ile Lys 
145 150 155 160 


Thr Gly Thr Cys Pro Phe Ser Phe Asp Lys Leu Asn Asn Tyr Leu Thr 
165 170 175 


Phe Asn Lys Phe Cys Leu Ser Leu Asn Pro Val Gly Ala Asn Cys Lys 
180 185 190 


Phe Asp Val Ala Ala Arg Thr Arg Thr 
195 200 
(2) INFORMATION FOR SEQ ID NO: 59: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 251 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: unknown 
(ii) MOLECULE TYPE: protein 


(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59: 


Glu Asn Met Glu Ile Asp Ser Met Leu Phe Val Ser Glu Asn Ala Leu 
1 5 10 15 


Lys Leu Ala Ser Val Glu Ala Phe Asn Ser Thr Glu Thr Leu Asp Pro 
20 25 30 


Ile Tyr Lys Glu Trp Pro Asn Ile Gly Gly Ser Trp Leu Gly Gly Leu 
35 40 45 


Lys Asp Ile Leu Pro Ser His Asn Ser Lys Arg Lys Tyr Arg Ser Ala 
50 55 60


Ile Glu Asp Leu Leu Phe Asp Lys Val Val Thr Ser Gly Leu Gly Thr 
65 70 75 80 


Val Asp Glu Asp Tyr Lys Arg Cys Thr Gly Gly Tyr Asp Ile Ala Asp 
85 90 95 


Leu 


Ala 


Ile 


Ile 


145 


Leu 


Gly 


Thr 


Asp 


Leu 


225 


Asn 


Val


Asn 


Thr 


130 


Ala 


Ser 


Asn 


Ser 


Val


210 


Gln 


Arg 


Cys 


Asp 


115 


Leu 


Val


Lys 


Ile 


Gln 


195 


Val


Asn 


Leu 


Ala 


100 


Asp 


Gly 


Gln 


Asn 


Thr 


180 


Gly 


Asn 


Asn 


Asp 


Gln 


Lys 





Leu 


Thr 


Phe 


Tyr 


Met 


Leu 


Arg 
150 


G 





n 


230 


Glu Leu 


245 




Tyr 
Ala 
Gly 
135 
Leu 
Ile 
Phe 
Thr 
Gly 
215 


Ala 


Ser 


Asn 


Met 


120 


Gly 


Asn 


Leu 


Gly 


Val


200 


Gln 


Ile 


Ala 


Gly 


105 


Tyr 


Gly 


Tyr 


Ala 


Lys 


185 


Ala 


Ala 


Ser 


Asp 


Ile 


Thr 


Ala 


Val 


Asn 


170 


Val 


Lys 


Leu 


Ser 


Ala 
250 


Met 


Ala 


Val 


Ala 


155 


Ala 


Asn 


Ala 


Ser 


Ser 


235 


Gln 


Val 


Ser 


Ser 


140 


Leu 


Phe 


Asp 


Leu 


His 


220 


Ile 


Leu 


Leu 


125 


Ile 


Gln 


Asn 


Ala 


Ala 


205 


Leu 


Ser 


Pro


110 


Ala 


Pro 


Thr 


Gln 


Ile 


190 


Lys 


Thr 


Asp 


Gly Val 


Gly Gly 


Phe Ala 


Asp Val 


160 


Ala Ile 
175 

His Gln 
Val Gln 


Val Gln 


Ile Tyr 
240 





What is claimed is: 


1. A vaccine composition comprising an isolated S protein
of canine coronavirus (CCV) strain 1-71 (SEQ ID NO:2),
useful to immunize a dog against CCV.

2. A vaccine composition according to claim 1 wherein
said S protein further comprises a fusion protein.

3. A vaccine composition according to claim 1 further
comprising an immunogenic amount of one or more additional
antigens.

4. A method of treating infection in dogs by canine
coronavirus, comprising treating a dog with a vaccine
composition of claim 1.


